**Outstanding Contributions to Logic 29**

# Thomas Piecha Kai F. Wehmeier Editors

# Peter Schroeder-Heister on Proof-Theoretic Semantics

# **Outstanding Contributions to Logic**

Volume 29

#### **Editor-in-Chief**

Sven Ove Hansson, Division of Philosophy, KTH Royal Institute of Technology, Stockholm, Sweden

Outstanding Contributions to Logic puts focus on important advances in modern logical research. Each volume is devoted to a major contribution by an eminent logician. The series will cover contributions to logic broadly conceived, including philosophical and mathematical logic, logic in computer science, and the application of logic in linguistics, economics, psychology, and other specialized areas of study.



Outstanding Contributions to Logic is published by Springer as part of the Studia Logica Library.

This book series is also a sister series to Trends in Logic and Logic in Asia: Studia Logica Library. All books are published simultaneously in print and online. The series is indexed in SCOPUS.

Proposals for new volumes are welcome. They should be sent to the editor-in-chief sven-ove.hansson@abe.kth.se.


*Editors*

Thomas Piecha, University of Tübingen, Tübingen, Baden-Württemberg, Germany

Kai F. Wehmeier, University of California, Irvine, CA, USA

ISSN 2211-2758 ISSN 2211-2766 (electronic)
Outstanding Contributions to Logic
ISBN 978-3-031-50980-3 ISBN 978-3-031-50981-0 (eBook)
https://doi.org/10.1007/978-3-031-50981-0

© The Editor(s) (if applicable) and The Author(s) 2024. This book is an open access publication.

**Open Access** This book is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. If you remix, transform, or build upon this book or a part thereof, you must distribute your contributions under the same license as the original.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cover photograph by Christoph Jäckle, 2016.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Paper in this product is recyclable.

## **Preface**

Peter Schroeder-Heister began his academic career in 1971 as a student of mathematics and Catholic theology — the latter of which would soon be replaced by philosophy — at the University of Bonn. There he was almost immediately drawn into mathematical logic by way of Gisbert Hasenjaeger's *Seminar für Logik und Grundlagenforschung*. After passing his *Staatsexamen* in philosophy and mathematics in 1977, Peter moved to the then still very new University of Konstanz, where he wrote his doctoral dissertation while editing, with Gereon Wolters, the academic estate of Hugo Dingler, and later working on Jürgen Mittelstraß's *Enzyklopädie Philosophie und Wissenschaftstheorie*. He obtained his Ph.D. in 1981 from the University of Bonn, Hasenjaeger serving as his *de iure* and Dag Prawitz as his *de facto* thesis adviser. Peter remained in Konstanz as Mittelstraß's *Assistent* until 1989, obtaining his habilitation on the way in 1987, before being appointed to a professorship in logic and philosophy of language at the University of Tübingen, where he would spend the remainder of his academic career, despite prestigious offers elsewhere, including from the University of Salzburg in 2000.

This volume reflects but one of Peter's research interests, though arguably his principal one and the one through which he has made the most significant impact: proof-theoretic semantics. It thus seems imperative to mention, in this preface, some of the other areas of intellectual endeavor on which he has left a mark, which include the history of logic (with publications on Dingler, Hertz, Frege, and Popper, not to mention Aristotle and Leibniz, but also as a long-time member of the editorial board for *History and Philosophy of Logic*), logic programming, Martin-Löf type theory, the philosophy of science, substructural and relevant logic, and empirical psychology.

Finally, the editors would each like to say just a few personal words about their relationship with Peter. TP met Peter during his studies at the University of Tübingen, where he participated with great joy in his very instructive courses in computer science and philosophy, first as a student and soon after also as a teaching assistant. He continued to work in Peter's group for many years and benefited enormously from Peter's constant support and the stimulating work environment, which, in particular, enabled TP to contribute to several international research projects, conferences organized in Tübingen, and joint publications with Peter. KW originally knew of

Peter as a scholar of Frege, through his articles on the permutation argument in §10 of *Grundgesetze* and on Frege's anticipation of propositional resolution, and had corresponded with him about this work before ever meeting him. In 2001, Peter literally rescued KW from having to leave academia by offering him a temporary position in Tübingen as his *wissenschaftlicher Mitarbeiter*. After one immensely enjoyable and productive year in Tübingen, KW moved on, with much help, encouragement, and both moral and material support from Peter, to more permanent employment with the University of California.

It was thus with great pleasure that we undertook the editing of this volume, and we are delighted at having this opportunity to express our appreciation of and gratitude to Peter. It only remains for us to thank the authors for their inspiring contributions.

Tübingen/Irvine, March 2, 2023 Thomas Piecha

Kai F. Wehmeier

## **Contents**



## **List of Contributors**

Michael Arndt Department of Computer Science, University of Tübingen, Germany, e-mail: arndt@cs.uni-tuebingen.de

Michael Bärtschi Institute of Computer Science, University of Bern, Switzerland, e-mail: michael.baertschi@inf.unibe.ch

Wagner de Campos Sanz Faculdade de Filosofia, Universidade Federal de Goiás, Goiânia, Brazil, e-mail: wsanz@ufg.br

Nissim Francez Technion – Israel Institute of Technology, Haifa, Israel, e-mail: francez@cs.technion.ac.il

Edward Hermann Haeusler Informatica, PUC-Rio, Rio de Janeiro, Brazil, e-mail: hermann@inf.puc-rio.br

Lars Hallnäs The Swedish School of Textiles, University of Borås, Sweden, e-mail: lars.hallnas@hb.se

Andrzej Indrzejczak Department of Logic, University of Łódź, Poland, e-mail: andrzej.indrzejczak@filhist.uni.lodz.pl

Gerhard Jäger Institute of Computer Science, University of Bern, Switzerland, e-mail: gerhard.jaeger@inf.unibe.ch

Reinhard Kahle Carl Friedrich von Weizsäcker Center, University of Tübingen, Germany, and CMA, FCT, Universidade Nova de Lisboa, Portugal, e-mail: reinhard.kahle@uni-tuebingen.de

Michael Kaminski Technion – Israel Institute of Technology, Haifa, Israel, e-mail: kaminski@cs.technion.ac.il

Chuck Liang Department of Computer Science, Hofstra University, Hempstead, NY, United States of America, e-mail: cscccl@hofstra.edu

Dale Miller Inria & LIX/Ecole Polytechnique, Palaiseau, France, e-mail: dale.miller@inria.fr

Victor Nascimento Filosofia, UERJ, Rio de Janeiro, Brazil, e-mail: victorluisbn@gmail.com

Luiz Carlos Pereira Filosofia, PUC-Rio/UERJ/CNPq, Rio de Janeiro, Brazil, e-mail: luiz@inf.puc-rio.br

Paolo Pistone Department of Computer Science and Engineering, University of Bologna, Italy, e-mail: paolo.pistone@uniroma3.it

Dag Prawitz University of Stockholm, Sweden, e-mail: dag.prawitz@philosophy.su.se

Paulo Guilherme Santos Centro de Matemática e Aplicações, NOVA School of Science and Technology, Universidade Nova de Lisboa, Portugal, e-mail: pgd.santos@campus.fct.unl.pt

Peter Schroeder-Heister Department of Computer Science, University of Tübingen, Germany, e-mail: psh@uni-tuebingen.de

Göran Sundholm Institute for Philosophy, Leiden University, The Netherlands, e-mail: goran.sundholm@gmail.com

Neil Tennant Department of Philosophy, The Ohio State University, Columbus, OH, United States of America, e-mail: tennant.9@osu.edu

Luca Tranchini Department of Computer Science, University of Tübingen, Germany, e-mail: luca.tranchini@gmail.com

Heinrich Wansing Department of Philosophy I, Ruhr University Bochum, Germany, e-mail: Heinrich.Wansing@rub.de

Bartosz Więckowski Institut für Philosophie, Goethe-Universität Frankfurt am Main, Germany, e-mail: wieckowski@em.uni-frankfurt.de

## **Proof-Theoretic Semantics: An Autobiographical Survey**

Peter Schroeder-Heister

**Abstract** In this autobiographical sketch, which is followed by a bibliography of my writings, I try to relate my intellectual development to problems, ideas and results in proof-theoretic semantics on which I have worked and to which I have contributed.

#### **1 The term: Proof-theoretic semantics**

Proof-theoretic semantics, from various perspectives, has been a predominant occupation for me since my doctorate. The field itself was already in existence prior to any contribution of mine. It was created by Gerhard Gentzen by designing a logical calculus representing the "natural" way of deductive reasoning, and by claiming that its special features — in particular its way of handling assumptions, and its classification of inference rules into rules for the introduction and the elimination of logical symbols — give certain rules a 'definitional' status, thus equipping logical symbols with their meaning (Gentzen, 1935). It was only natural that the German logician Franz von Kutschera called this approach "Gentzen semantics" (von Kutschera, 1968) and thus created a term in analogy to "Tarski semantics", which is often used for the dominant approach to denotational semantics established by Alfred Tarski. Similarly, I came up with the term "proof-theoretic semantics" as a systematic term in analogy to "model-theoretic semantics", which emanated from what Tarski had put forward.<sup>1</sup> In 1985, I first mentioned it as the title of a planned book in a letter to Dag Prawitz. Later I used the term in lectures I gave in Stockholm. It appeared in

Peter Schroeder-Heister

Department of Computer Science, University of Tübingen, Germany, e-mail: psh@uni-tuebingen.de

<sup>1</sup> For further reflections on the idea of proof-theoretic semantics see my *Stanford Encyclopedia of Philosophy* entry (E2012c) and, from slightly different perspectives, Wansing (2000) and Francez (2015).

print, I think, first in 1991 in an abstract (A1991e).<sup>2</sup> Nowadays it seems a 'natural' term, which has become a standard designation for a certain field of study and is often used without much consideration of its conceptual origin; something it shares with Robert Brandom's term "inferentialism" (Brandom, 1994, 2000). It suggests that we are talking about a semantics that is based on, or crucially uses, the notion of "proof" or "proof theory".
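Gentzen's classification can be illustrated by the standard natural-deduction rules for conjunction — a textbook example, not drawn from this chapter: the introduction rule says how a conjunction may be established and is read as conferring the meaning of $\wedge$, while the elimination rules, which recover the components, must be justified relative to it.

```latex
% Standard natural-deduction rules for conjunction.
% The introduction rule is taken as 'definitional' for \wedge;
% the elimination rules are justified in terms of it.
\[
\frac{A \qquad B}{A \wedge B}\,{\wedge}\mathrm{I}
\qquad
\frac{A \wedge B}{A}\,{\wedge}\mathrm{E}_1
\qquad
\frac{A \wedge B}{B}\,{\wedge}\mathrm{E}_2
\]
```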

#### **2 Academic roots: Hasenjaeger's institute**

I discovered logic relatively early. At the University of Bonn, where I started studying in 1971, I came into contact with the *Institute for Logic and Foundational Research*<sup>3</sup> during my first year of study, even though it had nothing to do with my official subjects, which at that time were Catholic theology and mathematics. The head of this institute was Gisbert Hasenjaeger (M2006c, Wirth, 2021), who, in the 1940s and 1950s, had studied, worked and published with Heinrich Scholz in Münster. He had modelled his institute on Scholz's "Institute for Mathematical Logic and Foundational Research", which was the first institute of its kind in Germany. Hasenjaeger — who, incidentally, was a sort of counterpart to Turing<sup>4</sup> — had a very liberal attitude towards what logic should be, so he did not follow a particular strand of logic exclusively. This attitude also applied to those whom he admitted to his group's seminars. He did not mind that I, as an undergraduate, attended seminars which were officially intended for graduates only. As early as late 1972, for a seminar directed by Wolfram Schwabhäuser in my third semester of study, I quite enthusiastically produced a detailed exposition of Gödel's incompleteness theorems. In that essay of nearly 50 pages I even included Y. Matiyasevich's result from two years prior that every recursively enumerable predicate is diophantine, from which the (negative) solution of Hilbert's tenth problem directly follows. Alexander Prestel, who later, as a professor in Konstanz, was to become instrumental for my career, had just completed his Habilitation in Bonn and, following his inaugural lecture on this topic, left me his notes. I had become acquainted with the issues of incompleteness and undecidability through Wolfgang Stegmüller's excellent book of 1959, which I must have hit upon during my first semester of study, or even earlier. 
I remember Hasenjaeger, during one of my presentations, correcting my pronunciation of the name "Kleene", which I pronounced like the word "clean", as I had not attended any logic lectures before and knew the name only from printed

<sup>2</sup> References of the form A⟨year⟩, P⟨year⟩, C⟨year⟩, D⟨year⟩, R⟨year⟩, E⟨year⟩, M⟨year⟩ refer to the corresponding sections of the bibliography at the end of this article.

<sup>3</sup> "Seminar für Logik und Grundlagenforschung"

<sup>4</sup> After being severely wounded in the war, he was, as a gifted mathematician (who had not yet started university studies), appointed to the cryptology department of the German military and tasked with testing the cryptological reliability of the Enigma machine. He was not able to discover those weaknesses that allowed Alan Turing and Gordon Welchman (based on ideas of Marian Rejewski and other Polish cryptologists) to decrypt the code it generated — "fortunately", as he said after the war, as otherwise the war might have lasted even longer.

sources. In 1973, still in my second year at Bonn, I wrote another essay on Gödel's functional interpretation of arithmetic (the *Dialectica* interpretation), which meant that by this point I had built up a significant background in logic. However, I regret that I did not take the chance of learning set theory or model theory in more depth. I would have had a perfect chance to do so, as besides Prestel, Keith Devlin, Ronald Björn Jensen, Sabine Koppelberg and Wolfram Schwabhäuser were all at the institute at the time. In any case, there was a spirit at the department that motivated young researchers like myself.

There was also funding available for students to attend meetings in mathematical logic, much of it from the Volkswagen Foundation. Already in early April of 1973, in my second undergraduate year, I had the chance to attend such a meeting in Tübingen, organized by my future colleague Walter Felscher, who had just become a professor of mathematical logic there. Solomon Feferman gave a two-day course on advanced proof theory, and I was able to meet quite a prominent (or soon-to-be prominent) section of the German logic scene, including Wilfried Buchholz, Justus Diller, Ulrich Felgner, Wolfgang Maas, Gert Müller, Helmut Pfeifer, Wolfram Pohlers, Kurt Schütte and Helmut Schwichtenberg. Particularly impressive was a week in January 1974 at the Mathematical Research Institute in Oberwolfach, where I had been invited to a meeting on set theory and model theory led by Felgner (another future Tübingen colleague) and where many outstanding German logicians were present. Of the other conferences or schools that I was able to attend as an undergraduate, I remember one in Münster on intuitionistic logic, with courses given by Anne Troelstra about choice sequences and Dirk van Dalen about intuitionistic logic in general.

Together with other students at the Hasenjaeger institute, notably Benedikt Peppinghaus and Hans Leiß, I embarked on all kinds of interesting ventures. Benedikt, for example, invited Eduard Wette, who was considered by many an enfant terrible of mathematical logic,<sup>5</sup> to present his purported system-internal consistency proof of arithmetic, which, by Gödel's second incompleteness theorem, would prove the inconsistency of arithmetic. I found this extremely stimulating, even though there were gaps in the proof he presented (as one might have expected). Again, it speaks for Hasenjaeger's liberal attitude that he permitted such things to take place. It is obvious that all this could pull somebody like me in the direction of proof theory and constructive logic. However, logic was only a part of my student life in Bonn between 1971 and 1977 (see Section 4), and I did not acquire any university degree in logic until my doctorate.

#### **3 Family roots: Upbringing and school**

I do not have an academic family background. My mother came from the household of a primary school teacher and was very well-educated, despite the fact that, due to the difficult circumstances of her time (including gender-based disadvantages) and family

<sup>5</sup> For a fair characterization of Wette see Paul Bernays's letter to Kurt Gödel of 24 January 1975 (Gödel, 2003).

obligations, she was unable to attend high school. My father, who, like my mother, was well-educated and had a talent for mathematics, ran a business that produced stationery (exercise books, writing pads, etc.), which he had taken over from his father. This prevented him from going to university after returning from America, where he had been a prisoner of war. Despite it not being his first-choice career, he was a very successful businessman. When he sold the company in the 1980s, he was able to establish, from the profits, a charitable foundation that built and ran an innovative local care home. In our town of Düren,<sup>6</sup> where I was born on 2 March 1953 as the eldest of four children, I attended the "humanist" high school with its emphasis on Greek and Latin, at the expense of modern languages and natural sciences, though I was provided with an excellent education in mathematics.

At school, I was always good at mathematics, and it was always assumed I would study mathematics at university (which I did). I otherwise became interested in philosophical and theological questions when I was fifteen or so, partly through the influence of a Catholic youth organization and its local leaders,<sup>7</sup> who were pretty left-wing (it was 1968, after all). Through a friend — Lothar Stresius, who was four years older, a theology student at Bonn and Tübingen,<sup>8</sup> and in many respects a role model — I had the chance, while still a high school student, to attend lectures in Tübingen, I think in 1970, by many famous German theologians of the time. These included Eberhard Jüngel (20 years later my colleague in the Department of Philosophy), Ernst Käsemann, Walter Kasper (today a cardinal of the Roman Curia), Hans Küng and Jürgen Moltmann (Joseph Ratzinger — later Pope Benedict XVI — had left Tübingen the year before). All this impressed me so much that, after my high school diploma in 1971, I enrolled in Bonn in Catholic theology, in addition to mathematics. After one year I replaced theology with philosophy, which I had been studying as part of the theology course, and which also interested me deeply.

#### **4 Bonn 1971–1977: Undergraduate study**

**Philosophy and mathematics.** My interest in philosophy dates back to when I was 16 or 17 and read, in addition to the fashionable philosophical literature of the time (Frankfurt school, especially Adorno and Horkheimer), some Popper (my earliest copy of the *Logik der Forschung* dates from 1970) and also some Kant. I was introduced to the latter by my excellent high school philosophy teacher Hubert Fackeldey, who later became a professor in Cologne and made significant contributions to philosophical aspects of deontic logic.

Philosophy in Bonn was a traditional department covering the whole history of the discipline from antiquity to modern times, where modern times meant essentially

<sup>6</sup> To mention its mathematical connection: Peter Gustav Lejeune Dirichlet (1805–1859), who propagated the notion of a mathematical function as an abstract (that is, operation-independent) mapping, was born there.

<sup>7</sup> Especially Paul Georg Meyer, who later became a linguistics professor at RWTH Aachen.

<sup>8</sup> Later, after a Ph.D. on Adorno, he became a high school principal.

hermeneutics and Heidegger, but also some Wittgenstein and analytical philosophy. Even medieval philosophy, normally underrepresented (if present at all), was very strong. Most impressive to me in philosophy was the approach of the Kantian Gerold Prauss, an associate professor who later went to Cologne and then to Freiburg. He was working on a novel interpretation of Kant's concept of the "thing-in-itself" and had just completed a book manuscript on it (Prauss, 1974). Although he did not know me, he had so much trust in me that he lent me a carbon copy of it (not a photocopy — that technique was just about to enter wider use), and at some stage, because he unexpectedly needed it for the publisher, had to find out where I was living to have it returned.

Prauss later also supervised the philosophy part of my university degree, which I completed in 1977, writing a thesis on the concept of truth in natural languages (originally I had even considered writing something about Heidegger). Contrary to what its title might suggest from a modern point of view, it was a piece of philosophy of language in the Kantian spirit, which had nothing to do with formal natural-language semantics. My final university degree was actually a combined teachers' degree in philosophy and mathematics. Like most humanities students at the time, I chose the teachers' degree simply because it gave one the possibility of entering the high school teaching profession if anything went wrong with the Ph.D.

Through Prauss I also got in contact with Günter Buhl, who had written an excellent dissertation on consequence and grounding in Bolzano (Buhl, 1961), the outstanding quality of which I only appreciated much later when working in proof-theoretic semantics, as he was the first to fully recognize the proof-theoretic aspects of Bolzano's work. Buhl taught elementary logic in the philosophy department using a system of natural deduction. He hired me as a student assistant, and I was able to give tutorials in formal logic, an experience that shaped all my later teaching of logic to philosophers.

I spent my student time until graduation essentially in Bonn, but also as a "guest student" in Cologne and Aachen. The initial choice of Aachen was chiefly due to the fact that my girlfriend Gabi was a student there. In Aachen, I attended seminars run by Christian Thiel, who later became the successor to Paul Lorenzen in Erlangen. I learned a lot from him. He drew my attention to Lorenzen's *Operative Logic* of 1955, which from today's perspective must be considered one of the milestones of proof-theoretic semantics,<sup>9</sup> even though Lorenzen himself abandoned his operative approach in favour of dialogical semantics in the 1970s, a move also appreciated by Thiel. Dialogical logic was a topic I became strongly interested in myself — much later we even had a research project on it (see Section 11). Outstanding, of course, were Thiel's knowledge and mastery of Gottlob Frege's work. In the German-speaking world he had initiated, through his doctoral dissertation on sense and denotation in Frege's logic (Thiel, 1965), the revival of interest in Frege's work, playing the role which in the English-speaking world was taken by Michael Dummett and his monograph on Frege's philosophy of language (Dummett, 1973).

<sup>9</sup> See my later papers A2007b, A2007c, A2008a.

**Music and musicology.** I had a modest background in musical performance (piano) and therefore a corresponding interest in music and its theory. For three semesters I studied musicology in Bonn as an additional subject, which was possible at that time. Of the many courses I took, the most stimulating concerned theories of harmony, which was only natural given my mathematical inclinations. A lasting impression was made on me by Martin Vogel, who had developed a systematic approach to the topic that differed greatly from the historical attitude typical of German musicology. Together with a small group of students, some of whom came from mathematics, he developed a theory of harmony based on pure tuning, mathematically modelled as (an extension of) Leonhard Euler's web of tones and based on approaches by the physicists Hermann von Helmholtz and Arthur von Oettingen (Vogel, 1975). Had I continued musicology, I probably would have followed this path of study myself, in particular as my later occupation with computer science would have provided the right environment and interesting practical applications for it.

**Private life.** The summer of 1974 changed my life, when I met Gabi (Gabriele Heister), who was to become my wife five years later. Gabi was studying in Aachen for two degrees at the same time, one in psychology and one joint degree in German and education. I met her on a holiday trip to the Netherlands in a larger group of students. Apart from sharing my interest in music (she had at one point trained as a soprano), she also influenced my general scientific views. Whereas I was very much a humanities scholar (including logic and mathematics), she had a thorough grasp of empirical research. Moreover, through experimental psychology she was concerned with the statistical analysis of data, which was completely new to me, even though I knew some probability theory and mathematical statistics. Discussing these topics with Gabi broadened my intellectual views significantly. Drawing up hypotheses, looking at the outcome of experiments, and using statistical methods in the evaluation of data — the absolutely normal approach in the sciences — was something I was not used to.

#### **5 Konstanz 1978–1981: Encyclopedia and doctorate**

**What next?** When I had completed my teacher's degree in May 1977, the question was: what to do now? I wanted to do a doctorate, but I was not keen to continue with the topic of my undergraduate thesis, and apart from that there was no job in sight to finance it. As it turned out, Gabi, who had completed her diploma in psychology around the same time, had been offered several positions, including one at the University of Konstanz for the year. With no other options for me in sight, we decided to go. In November 1977 we found a flat in Allensbach near Konstanz, and in December we moved. For Gabi the working conditions were less than ideal, to put it mildly. For me, the move turned out to be one of the best decisions of my life. There was no job initially, but that was soon to change.

**Gereon Wolters, the Dingler Nachlass, and the Encyclopedia.** I owe the opportunity to start an academic career in Konstanz first to Gereon Wolters and then to

Jürgen Mittelstraß. In early 1978, I walked into Wolters's office. He had completed his Ph.D. the year before and was affiliated with Mittelstraß, who held one of Konstanz's three chairs in philosophy. It turned out that Wolters had just successfully applied for a grant to make the Nachlass (literary estate) of the philosopher Hugo Dingler accessible. There was a part-time position available on this project for 10 months, with no candidate yet nominated. Wolters hired me on the spot. He even gave me an office of my own, the window of which offered a magnificent view over Lake Constance and the Isle of Mainau. As the university was still under development, it could be relatively generous with respect to office space. My first publication resulted from this job: a bibliography of the works of Hugo Dingler (A1981a). It was a piece of meticulous work, of which I am still proud.

Through Wolters, I joined the team assisting Jürgen Mittelstraß in his editorial work on the *Encyclopedia of Philosophy and Philosophy of Science* (1980–2018), a voluminous work that he had started some years earlier, with the first volume, covering the letters A–G, due to be completed in spring 1979. A logician with interests in adjacent fields was a very welcome addition to the team. When in 1979 a two-year position became available, I received a part-time appointment to work on the Encyclopedia, being free to work on my dissertation during the rest of my time. Over the years (the eighth and final volume of the second edition appeared in 2018, the year before my retirement) I wrote around 200 articles for it (E1980–2018); tiny ones such as *a* (the letter mnemotechnically representing universal affirmative judgements in traditional syllogistics), but also longer ones such as *Popper, Karl Raimund* (E1995a). We also had a lot of fun creating a number of fictitious entries ('Nihilartikel', 'mountweazels') — a long tradition in encyclopedic works, which reached a new level of sophistication under Mittelstraß.

**Linguistics and mathematics.** In Bonn I had come into contact with logic almost exclusively from the mathematical perspective. As to philosophical logic, I was only acquainted with topics relating to the foundations of mathematics, such as the dispute about classical versus intuitionistic logic or the discussion of the logical and set-theoretical paradoxes. I was not aware of extra-mathematical applications of logic.

Making interdisciplinary contacts was very easy in Konstanz. All disciplines were under the same roof in one huge building complex, where one would frequently run into people from other departments in the corridor, in the cafeteria and in the large entrance lounge of the compound. People in general linguistics were natural partners to whom one would speak. In Konstanz, general linguistics had a strong leaning towards logic-based semantics. It was through this that I engaged more seriously with Montague grammar,<sup>10</sup> higher-level linguistic type theory and similar issues. I discovered that the general linguists were all doing logic in a very original fashion and at an advanced level that went beyond the sort of mathematical logic I was used to. This was essentially due to the intensional phenomena one has to deal with in linguistics, which do not play a role in mathematics, which is essentially (though not throughout) extensional. My main contacts on the professorial side were Urs Egli and

<sup>10</sup> Had I followed Hans Leiß's initiative, I would have studied this subject together with him already in Bonn.

Arnim von Stechow (who later became my colleague in Tübingen), and at the level of the postdocs Thomas Ede Zimmermann (who was later in Tübingen and Stuttgart, and then became professor in Frankfurt) and Wolfgang Sternefeld (later a colleague in Tübingen). I learned a great deal that I would probably never have encountered at a less communicative university with a more conventional department structure.

Konstanz had a strong mathematics department, and also a strong group in mathematical logic, headed by Alexander Prestel, whom I knew from Bonn. Prestel was a set theorist and model theorist and held a weekly seminar which I regularly attended. A special role was played by Ulf Friedrichsdorf, a permanent member of Prestel's group. He was a hardcore mathematical logician, but with an incredibly wide range of interests: he co-authored textbooks on model theory and set theory with Prestel, and his interests extended in particular into philosophical logic and linguistics, as exemplified by his brilliant textbook on classical and intensional logics (Friedrichsdorf, 1992), which even covered Solovay's arithmetical completeness theorem for provability logic, proved some years earlier. I have frequently used it for teaching. In Konstanz I realized for the first time what a collaboration between philosophers, mathematicians and linguists could achieve.<sup>11</sup>

**Teaching.** From 1981 to 1989 I held full-time positions in Konstanz, made possible by Jürgen Mittelstraß. This meant that I was obliged to teach, and in Konstanz I was often able to choose which subjects to teach, which would not have been possible elsewhere at this early stage of my career. I taught logic courses, but also other topics including Frege, Husserl, Carnap, Popper, ontology, epistemology, general philosophy of science, and later also logic programming and automated theorem proving. I learned a lot from teaching, and I had quite a number of excellent students. As mentioned above, the architecture of the university encouraged interdisciplinary activities, and there were always students from mathematics, linguistics and other subjects in my courses.

**Marriage and children.** Gabi and I got married in December 1979. This is how I arrived at my double-barrelled surname, "Schroeder-Heister", adding Gabi's surname to my birth name, "Schroeder". As the ability of husbands to take their wives' names had only just been introduced in Germany, the administration was not used to it. The registrar visited us at home after the ceremony to fill in signatures that had been forgotten during it. Our children were born a couple of years later: Paula in 1984 and Justin in 1987, both in Münsterlingen, on the Swiss side of the lake. This did not give them Swiss citizenship, however: it is not that easy to become Swiss.

**A Ph.D. without a supervisor.** I took the risk of doing my Ph.D. without a formal supervisor. There was no "natural" supervisor in Konstanz for my field of interest. In fact, for a long time I did not know what to work on. What would become the final topic of my thesis only emerged about twelve to eighteen months before I submitted it. Until then, it was not even clear whether it would be mathematical or philosophical logic, and how strong its philosophical component would be.

I was working in all sorts of directions, many of which I no longer remember.

<sup>11</sup> For the history of logic at Konstanz University see Buldt (2022).

What I do remember is that I somehow hit upon von Kutschera's 1968 article, in which he coined the term "Gentzen semantics" (see Section 1). There he showed, for intuitionistic propositional logic, the expressive completeness of the four standard connectives conjunction, disjunction, implication and absurdity, corresponding to the classical idea of functional completeness. This was established relative to a general schema of rules characterizing arbitrary n-ary connectives. The crucial idea was to introduce an iteration of a sort of structural implication, leading to what he called "-formulas", which played a critical role in the interpretation of implication as a connective. At the time I also hit upon Prawitz's 1979 article, which had the same aim of proving expressive completeness. Whereas von Kutschera worked in a sequent-style framework, Prawitz used the apparatus of natural deduction that he had put forward in his monograph (Prawitz, 1965). It turned out that Prawitz's paper contained an error, as he could not give a proper schema for implication without presupposing it. My idea was to combine von Kutschera's idea of "-formulas" with Prawitz's natural deduction schema. To achieve this, I developed a system of rules of higher levels, that is, of rules that may depend on rules, and used this tool to generalize elimination rules. This provided a general framework for dealing with arbitrary connectives. The introduction and elimination rules for logical connectives were understood as saying that a connective expresses a system of inference rules, which gave the connectives a semantics in terms of rules, and thus a sort of proof-theoretic semantics (although I did not yet use this term in my thesis). In the second part of the thesis, following some ideas of von Kutschera (1969), I extended this approach to a system with an operation of denial and corresponding refutation rules, something which nowadays would be called bilateralist, but which I never published afterwards.
I developed this very quickly, and my thesis was essentially written in the first half of 1980.
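The shape of these generalized elimination rules can be sketched as follows; my notation here is a simplified illustration for the propositional case, not the exact notation of the thesis. If a connective \(c\) is introduced from premiss collections \(\Delta_1,\ldots,\Delta_n\) (which in the higher-level setting may themselves contain rules), then \(c\) is eliminated by reasoning from each \(\Delta_i\) to an arbitrary conclusion \(C\):

```latex
% Introduction rules for a connective c, and the generalized
% elimination rule read off from them:
\[
\frac{\Delta_1}{c} \;\cdots\; \frac{\Delta_n}{c}
\qquad\leadsto\qquad
\frac{c \qquad
      \begin{array}{c}[\Delta_1]\\ \vdots\\ C\end{array}
      \;\cdots\;
      \begin{array}{c}[\Delta_n]\\ \vdots\\ C\end{array}}{C}
\]
% The familiar special case of conjunction, with one introduction
% rule whose premisses are A and B:
\[
\frac{A \quad B}{A \land B}
\qquad
\frac{A \land B \qquad
      \begin{array}{c}[A,\,B]\\ \vdots\\ C\end{array}}{C}
\]
```

Here the square brackets indicate assumptions discharged by the elimination rule, as in standard natural deduction notation.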

How to proceed formally towards a Ph.D.? The thesis was very much done in the spirit of Dag Prawitz, whose works I admired, in particular his combination of formal investigations with philosophical interpretation. However, I did not know him personally and had not seen him at any conference. In fact, Prawitz was to me somebody of such high status that I did not dare approach him. Help came from Alexander Prestel. He encouraged me to contact Prawitz after all, and actually sent him a photocopy of my thesis draft, recommending me to him. With Prestel I also discussed the idea of finally submitting the thesis in Bonn with Hasenjaeger as examiner, even though he had not been involved with my thesis at all. From his time in Bonn, Prestel knew Hasenjaeger very well and was aware that I had left a good impression on him years before. The arrangement with Hasenjaeger finally worked out. After seeing my draft, and in light of my earlier achievements as an undergraduate student, he accepted me. The final plan was, in addition to Hasenjaeger as the first examiner, to get Prawitz appointed as external examiner, which the philosophical faculty agreed to.

The only problem was that I had no reply from Prawitz, who at the time was on a research stay in Italy. Per Martin-Löf, who knew about my thesis draft from Prawitz, supported me and encouraged him to respond to my request. I met Martin-Löf for the first time in Munich in early 1981, at a meeting organized by Helmut Schwichtenberg, where he presented his constructive type theory and where Peter Aczel presented

his version of intuitionistic set theory. This meeting impressed me enormously, in particular the way in which Martin-Löf presented his system. It was very philosophical; he argued for the rules he was giving from first principles.

Suddenly, in mid-January 1981, after I had heard nothing from Prawitz for months, there came a 25-page handwritten letter in which he discussed my thesis draft in detail. This was extremely helpful, in particular as he pointed out certain problems associated with introducing rules as structural entities, since this essentially meant a duplication of implication. In principle, however, he was quite happy with the approach. This allowed me to revise and finish the thesis very quickly and to submit it formally, so that in May 1981 my oral doctoral exam ("Rigorosum") took place in the form of three examinations on one day: in logic as the main subject, and in philosophy and musicology as minor subjects. Due to the existence of Hasenjaeger's institute (Section 2), the formal degree subject of the doctorate was "Logic and Foundational Research" (rather than "Philosophy"). I only met Prawitz personally the year after. Thus, he had been the hidden supervisor of my thesis.

Part of my thesis (A1981b), which itself was written in German, appeared in English in revised form under the title "A natural extension of natural deduction" (A1984a) in the *Journal of Symbolic Logic*<sup>12</sup>. The title goes back to comments made by Göran Sundholm, who acted as a reviewer for the *JSL*<sup>13</sup>. It was to become my most-cited paper. Perhaps this confirms the view that for normal researchers, the work for their doctorate is often the most substantial work they ever do, with all the rest being incremental extensions of it. Somebody like Gödel had several different and independent fundamental ideas in his lifetime, but most people are no Gödel.

#### **6 Konstanz 1981–1989: Postdoctoral research and Habilitation**

Being free after completing the Ph.D., I continued my research in various directions, on topics that still occupy me today.

**Semantical completeness.** My dissertation gave a proof-theoretic semantics of propositional logics and proved the *expressive completeness* of connectives with respect to this semantics. What I tried next was to prove Prawitz's completeness conjecture of 1971 (Prawitz, 1971, p. 257), namely that the formalism of intuitionistic logic is *semantically complete* with respect to a certain sort of proof-theoretic semantics. I published two papers on this issue (A1983c, A1985b), with practically no feedback. Only some 35 years later did I come back to these issues, in joint work with Thomas Piecha, in which we showed the limitations of attempts to prove completeness; see below (Section 10). Today, against the background of the modern discussion (see Piecha, 2016), this topic has regained some interest through work by Tor Sandqvist (2015), David Pym and others (2022).

<sup>12</sup> A transfer to quantifier logic appeared in A1984c.

<sup>13</sup> The original paper submitted to the *JSL*, which differs substantially from the final printed version, is still worth reading. It is attached to the online version of my thesis A1981b.

**Popper.** One of the outstanding experiences of my intellectual life was meeting Karl Popper. I had already been in contact with him once, when I asked him for information about Kurt Grelling, on whom I had to write an article for the *Encyclopedia* in 1980. I was extremely impressed that he replied immediately to my request with a letter in his beautiful handwriting. Shortly after my doctorate I came, more or less accidentally, across the papers on deductive logic that he had published in the late 1940s after finishing his *Open Society* (Popper, 1945). These papers were very different from the topics one would expect from Popper, and they had received a mixed reception in the logic community. Even though Popper was one of the leading philosophers of the twentieth century, his contribution to deductive logic remained largely unknown.

I immediately saw the significance of these papers for proof-theoretic semantics. They attempted to provide an inferentialist foundation for logic. What I discovered was that Popper's papers provided an original contribution towards the problem of the logicality of sentence operations. I tried to work out this idea in a long paper, which appeared in *History and Philosophy of Logic* (A1984d). A later paper in the *Popper Centenary* volume (A2006b) attempted to remedy the neglect of Popper's logical theory, as did papers by my students and colleagues David Binder and Thomas Piecha a decade later (2017, 2021). Recently, the three of us completed a volume containing Popper's logical papers as well as manuscripts from his Nachlass (estate) on these topics, including exchanges of letters related to logic, together with a detailed introduction (C2022b, A2022a)<sup>14</sup>. We look forward to seeing the reaction of the logic community to Popper's ideas.

In July 1982, I sent the first version of my manuscript (of A1984d) to Popper. I immediately received a reply, followed by another with more detailed comments a few days later. This developed into an exchange of letters (M2022). Popper started this exchange with me, somebody who was a complete unknown to him, simply because he was interested in the topic and in what I had done on it. Some time later, Gabi and I met Popper in person. We were staying in Edinburgh, where I had a three-month fellowship at the Institute for Advanced Studies in the Humanities, and Popper was the keynote speaker at a meeting in Leicester in spring 1983 that we attended. He greeted us there, and I had the chance to speak with him about the paper. Soon after, we had another exchange of letters on inductive probability and his joint paper with David Miller (Popper and Miller, 1983)<sup>15</sup>. Years later, after a lecture in St. Gallen in June 1989, he greeted me, with his characteristic sense of humour, with "Ach, Herr Schroeder, ich hätte Sie fast nicht erkannt, Sie sind aber dick geworden" ("Ah, Mr Schroeder, you have put on weight; I almost didn't recognize you").

My interest in the foundations of probability theory and the notion of randomness, which I had maintained since my mathematical studies in Bonn, was also, in part, due to Popper. The chapter on probability in the *Logik der Forschung* presented a frequentist definition of probability which was highly original and, being a definition of randomness for finite sequences, anticipated certain aspects of A. N. Kolmogorov's

<sup>14</sup> For a summary of (our view of) Popper's ideas on deduction see Piecha (2023).

<sup>15</sup> In Popper and Miller (1987), they explicitly acknowledged comments of mine on the possible dualization of their argument, which was also quite rewarding at this stage of my career.

later definition in terms of algorithmic complexity. I studied some of these theories, in particular those of R. von Mises, A. Wald, Martin-Löf, C. P. Schnorr and Kolmogorov, but was never able to turn this into a research topic: too many excellent researchers were already involved, and I was fully occupied with other issues. However, a presentation of Popper's approach in a collection of essays on the *Logik der Forschung* resulted from it (A1998b).

**Psychology.** Gabi completed her Ph.D. in experimental neuropsychology in 1985. Through following her work, I found this type of research more and more interesting and gradually started looking into it myself. Her experiments were based on reaction times to tachistoscopically presented stimuli. She realized that work in general psychology in the area of perception played a significant role in the effects observed, and her work shifted somewhat towards that realm. She became interested in spatial stimulus-response compatibility: the phenomenon that in most situations spatial features of a stimulus correspond to spatial features of the response, in the sense of shorter reaction times. This finding allows certain insights into the way spatial information is processed and how cognitive processing is organized in the brain. My contribution went slightly (but only slightly) beyond statistical evaluation, towards the discussion and theoretical modelling of results, which is reflected in the fact that I was first author of one of the around ten papers in which I was involved (P1988).<sup>16</sup> This work absorbed me for quite some time, as it was so interesting, even though it distracted me from proof-theoretic semantics. Nevertheless, it fitted my interest in cognitive science.

**Philosophy of science.** The 1970s and 1980s were the heyday of philosophy of science, in which general topics such as the status of theoretical terms and grand themes such as normal versus revolutionary science were discussed. A particularly important topic was the thesis of the incommensurability of theoretical terms in the case of revolutionary science, put forward by T. S. Kuhn. In Germany, J. D. Sneed's model-theoretic understanding of scientific reasoning was prominent, especially through Stegmüller's (1979) promotion of it. Every logically inclined philosopher would have studied these approaches at the time. In my case this led to a paper in *Philosophy of Science* (A1989) which argued, by model-theoretic means, that reducibility and incommensurability of theories are not necessarily incompatible. It was to become my only model-theoretic publication and drew on the expertise of my co-author Frank Schaefer, who was studying model theory with Prestel.

**Frege.** Something that has always intrigued me is the philosophy and logic of Gottlob Frege. I came across it at a very early stage of my studies: my copy of the *Grundgesetze der Arithmetik* dates from 1971, my first semester of study. However, practically all the interesting questions I have dealt with in various papers were triggered by Christian Thiel, partly by remarks in discussions with him, partly by his many publications on Frege and Fregean themes.

The first paper of this kind was on Section 10 of the *Grundgesetze* (A1987a),

<sup>16</sup> See subsection P of the bibliography. Sadly, our co-author of many papers, Walter Ehrenstein, died in 2009, aged only 58 (Paramei, 2009).

which I also presented at a Frege conference in Schwerin in 1984 (A1984b). At this conference I had the chance to meet almost the whole logic community of the German Democratic Republic (GDR), but also many people from elsewhere. Doing logic in the GDR was a way to avoid the ideological pressure exerted on other branches of philosophy. Consequently, in communist East Germany logic was better represented within philosophy than in West Germany. In the paper I tried to reconstruct Frege's arguments about whether, and which, courses-of-values could be chosen to function as truth values, a question fundamental for the ontology of the *Grundgesetze*. At the time I was quite satisfied with the paper, and later even more so when it appeared among the very few works Dummett cited in his book on Frege's philosophy of mathematics (Dummett, 1991). On Kai Wehmeier's initiative, the topic of this paper was later rethought in a joint paper (A2005), which we dedicated to Thiel.

When looking at Frege's notation and terminology in the *Grundgesetze*, I realized that in a certain way it closely resembles the structure of Gentzen sequents and the terminology applied there. I wrote a paper on this, but realized just as it was finished that von Kutschera, to whom I had sent it, had independently had very similar ideas<sup>17</sup>. For another paper on Frege, however, I claim full originality (A1997). In it I interpreted his propositional calculus from the *Grundgesetze*, which he applied to solve a problem discussed by G. Boole, E. Schröder, W. Wundt and H. Lotze, as a system of propositional resolution. I found it very intriguing that systems with the cut rule (plus substitution) as the only inference rule, which have become prominent in automated theorem proving, could be traced back to Frege, at least for the propositional fragment. Needless to say, Frege has been a frequent topic of my teaching.
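The idea of a purely cut-based propositional system can be made concrete in a few lines. The following is my own minimal sketch of ground propositional resolution (clauses as sets of signed integer literals), not Frege's notation, and it omits the substitution rule: saturating a clause set under the resolution (cut) rule decides unsatisfiability for the propositional fragment.

```python
def resolve(c1, c2):
    """Apply the resolution (cut) rule: for each literal of c1 whose
    negation occurs in c2, yield the resolvent with that pair cut out."""
    for lit in c1:
        if -lit in c2:
            yield (c1 | c2) - {lit, -lit}

def unsatisfiable(clauses):
    """Saturate under resolution; deriving the empty clause
    signals unsatisfiability."""
    clauses = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for c1 in clauses:
            for c2 in clauses:
                for r in resolve(c1, c2):
                    if not r:          # empty clause derived
                        return True
                    new.add(frozenset(r))
        if new <= clauses:             # saturated without the empty clause
            return False
        clauses |= new
```

Literals are signed integers (n for a variable, -n for its negation), so that, for instance, `{1, 2}` encodes p ∨ q; this encoding is my own choice for the sketch.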

**Historical foundations of logic.** Frege is the greatest figure as far as the foundations of modern mathematical and philosophical logic are concerned. Two other great figures in the (occidental) history of logic before him were Aristotle and Leibniz. I was interested in both of them and have taught and written on both. On Leibniz I wrote a paper with Mittelstraß concerning his arithmetical calculus (A1986), towards which I had been directed by Thiel, and which can be seen as a model-theoretic semantics in terms of pairs of coprime natural numbers. This is closely related to Leibniz's programme of a *Characteristica Universalis* that would code concepts by arithmetical means and allow one to decide the validity of inferences by calculation, sometimes seen as a precursor of modern AI. The paper also contains remarks on the concept of probability in Leibniz as the degree of possibility. On Aristotle I have a paper (A2008), also with Mittelstraß, which discusses, in a certain context, his modal syllogistics as well as the significance of the fourth figure of traditional syllogistics (not yet present in Aristotle). I emphasized in particular a result by Daniel Merrill (whom I knew already from my work on Popper's logic), which showed that in the presence of obversion every syllogism (even an invalid one) can be reduced to the fourth figure, something not possible for the other figures (to which, as Leibniz was already aware, every valid syllogism can be reduced). I also studied the work of Paul Hertz, a predecessor of

<sup>17</sup> Von Kutschera (1996). Though I presented my results on various occasions (e.g. A1999), I published them only much later (A2014a), of course with proper acknowledgement to von Kutschera.

Gentzen, who developed a system of purely structural reasoning whose inference steps can be viewed as applications of the resolution rule and whose proofs can be put into certain normal forms (A2002b). This is highly significant for proof-theoretic semantics, as this system is the starting point of Gentzen's research and the topic of his first publication (1933). Hertz's iteration of structural implications is reminiscent of my own higher-level rules. If I remember correctly, it was again Thiel who had drawn my attention to Hertz. Michael Arndt (D2008) continued and further advanced this work on Hertz.

**Logic programming and computer science.** Around the time of my doctorate, computing took off for the general public. My doctoral thesis was still written on a typewriter, with the formulas inserted by hand and larger corrections made with scissors and sticky tape, thus by cut and paste in the literal sense. I had already come across advanced computing through the linguists, who owned a Lisp machine; I do not remember which brand it was. In any case, it was a revelation to see the graphics display operated with a mouse, and the large 8-inch floppies used to store programs and data. Through the linguists I also came across PROLOG and logic programming, and, following that, I hit upon J. W. Lloyd's 1984 book on its foundations, which had just come out.

After reading Lloyd's book, I realized that logic programming, when understood proof-theoretically, had very much to do with what I myself was doing. In particular, my own framework with rules as assumptions offered not only a neat interpretation of logic programming, but even allowed for an extension of logic programming by embedded implications. These things were in the air, with people such as Dov Gabbay and Uwe Reyle (1984) as well as Dale Miller (1986) doing similar things.
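As a toy illustration of what an embedded implication adds (a hypothetical mini-interpreter of my own for this sketch, not the systems of Gabbay and Reyle or of Miller): a goal of the form D ⇒ G is proved by adding the clause D to the program and proving G in the extended program.

```python
# A propositional logic program is a list of (head, body) clauses,
# where body is a list of goals.  A goal is either an atom or
# ("imp", clause, subgoal): an embedded implication, proved by
# extending the program with the clause and proving the subgoal.

def prove(goal, program, depth=20):
    """Backward-chaining proof search with a simple depth bound."""
    if depth == 0:
        return False
    if isinstance(goal, tuple) and goal[0] == "imp":
        _, clause, subgoal = goal
        # Embedded implication: assume the clause, prove the subgoal.
        return prove(subgoal, program + [clause], depth - 1)
    return any(head == goal
               and all(prove(g, program, depth - 1) for g in body)
               for head, body in program)

# Example: "a" holds only under the assumption "b", so the embedded
# implication b => a is provable while the bare goal "a" is not.
program = [("a", ["b"])]
```

The depth bound is just a crude loop guard for the sketch; a real system would use a proper proof-search strategy.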

Beginning with Schwichtenberg's Munich conference of 1981 (see Section 5), and through my contacts in Stockholm, I also became involved with the type-theory-oriented programming community, which did not have much overlap with the logic programming community, apart from people like myself, Lars Hallnäs and Dale Miller. This community was much closer to functional programming, which has often tended to distance itself from other paradigms. People in Edinburgh such as Gordon Plotkin and the *Logical Framework* group realized that my higher-level rules and the generalized elimination rules for logical operators could be of considerable computational use for the implementation of logic systems, with the effect that I was invited to conferences there at which many prominent computer scientists and computational logicians were present. My contribution to Gérard Huet's and Plotkin's volume on logical frameworks (A1991d), in which, by reference to substructural logics, I gave the general elimination inferences the status of a kind of structural rule, resulted from this. I also made contact with colleagues in Cambridge; I remember talks in Larry Paulson's and in Martin Hyland's seminars<sup>18</sup>. My strong interest in Martin-Löf type theory, and the fact that I could apply my methodology to it (A1989a), kept me in lasting contact with the Programming Methodology Group in Gothenburg, quite independently of the many regular encounters and personal discussions with Per Martin-Löf himself over the decades.

<sup>18</sup> And in Paulson's office I met a young French postdoc, Thierry Coquand.

All this coincided with my developing an interest in cognitive science and artificial intelligence, not only from the point of view of rule-based AI (today the "old" AI), but also from the viewpoint of neural networks, which were enthusiastically discussed at the time. From a conference in Berne, which I attended together with Gabi in the early 1980s or perhaps even before, I still remember a presentation by Allen Newell, where he demonstrated one of the big expert systems of the time in which he was involved; I think it was MYCIN. If I remember correctly, Herbert Simon was there too (perhaps also Marvin Minsky, but my memory might be deceiving me). Later in the 1980s I attended a lecture by David Rumelhart in Paul Feyerabend's colloquium at the ETH Zurich, where he described his backpropagation algorithm for neural networks.

**Habilitation.** From October 1983 onwards, I had an assistantship with Mittelstraß<sup>19</sup> with much time for research. My professional duty, apart from teaching, was work on his Encyclopedia. It was also expected that within the six-year period of this assistantship I would complete my Habilitation, as normally required for a professorship in Germany. In the years after my doctorate I wrote quite a number of articles on the topics mentioned above (Section 6), but no single coherent piece of work of the kind demanded by the Philosophical Faculty as a Habilitationsschrift. In 1987 I decided to put all my non-historical logic material together, essentially around the theme of higher-level rules, which had already been the topic of my doctoral thesis, but was now extended in various directions. I included chapters relating my framework to the sequent calculus rather than natural deduction, to the bunch-based systems of relevant logic (see below, Section 7), to logic programming, and to the framework of Martin-Löf type theory. Putting my family under enormous stress (Gabi was working as a postdoc in Zurich and our second child was born in March 1987), I wrote it in a couple of months and submitted it in late summer, shortly before we left for a stay in Sweden from September 1987 until January 1988. The thesis (A1987c) was relatively short, but satisfied the requirements.<sup>20</sup> Prawitz was one of the reviewers, and I was later told that Prestel, who had attended the faculty meeting as an external expert, argued: "If Prawitz says yes, the faculty can't say no". In the obligatory colloquium with the Philosophical Faculty I spoke about "What is probability?"; the topic had to be different from that of the thesis and was selected from three themes submitted by the candidate. I talked about probability in a very general way to the humanities scholars in the audience.
Because musicology was not represented as a subject at Konstanz, I even dared to say something about the Tristan chord and its relative frequency in music history, mentioning its occurrence in Beethoven's op. 31,3 piano sonata which I had learned about in Bonn with Martin Vogel: something that would have been a no-go for me if musicology professors had been there.

There was a formal inaugural lecture shortly afterwards. I still have the handwritten slides for it. I talked about logic and its future, with quite some emphasis on the issues of computer science and artificial intelligence I had become interested in (including

<sup>19</sup> Assistant professor without tenure would be the closest American analogue.

<sup>20</sup> It circulated as a manuscript and was later made available on my website. A publication as a book in the Bibliopolis series "Studies in Proof Theory" was recommended by Dag Prawitz, but did not materialize.

nonmonotonic reasoning, polymorphic typing and hardware verification). So my general intellectual attitude towards the end of the 1980s, when I moved to Tübingen, was a compound of proof-theoretic semantics, foundations of computer science, logic programming, general psychology and cognitive science.

#### **7 Konstanz 1981–1989: Start of long-term collaborations and long-term friendships**

At the time of my Ph.D. and shortly afterwards, I made scientific contacts and established personal friendships that lasted for the whole of my career and shaped my research topics.

**Lars Hallnäs, definitional reflection and extensions of logic programming.** Extending logic programming by means of implications in the bodies of clauses was a natural idea which fitted well with the approach of higher-level rules developed in my thesis. However, Lars Hallnäs had an idea which went much further, namely using the schema for general elimination rules as a schema for the inversion of systems of arbitrary, not necessarily logical rules, in particular the rules of a logic program. He called it the principle of *definitional reflection*, as one reflects on the given definition as a whole in order to invert it, the approach itself being called "definitional reasoning". Hallnäs's idea was to incorporate it into logic programming by means of a logic programming language which allowed one to evaluate goals according to this rule. In the 1990s, with his collaborators at the Swedish Institute of Computer Science (SICS), which was heavily involved in implementing PROLOG, he developed such a system. We published the idea of proof-theoretic extensions of logic programming, including definitional reflection, in a two-part article at the end of the 1980s (A1990a, A1990b). From this emerged the idea of setting up a series of conferences on *Extensions of Logic Programming (ELP)*. This worked out very well. We initiated and partly organized five conferences with corresponding proceedings: ELP1989 in Tübingen (C1991b), ELP1991 in Stockholm (C1992), ELP1992 in Bologna (C1993), ELP1993 in St. Andrews (C1994) and ELP1996 in Leipzig (C1996).
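In propositional miniature, the two directions can be sketched as follows (a toy illustration in my own encoding, not Hallnäs's SICS system): reading the clauses of a definition as introduction rules gives derivability of an atom, while definitional reflection inverts this, licensing a conclusion from an assumed atom once the conclusion follows from every defining body.

```python
# A "definition": each atom is mapped to the bodies of its clauses.
definition = {
    "a": [["b"], ["c"]],   # a <= b ;  a <= c
    "b": [[]],             # b        (a fact: empty body)
    "c": [["missing"]],    # c <= missing  (underivable)
}

def derivable(atom, seen=frozenset()):
    """Introduction-rule reading: an atom holds iff some clause body
    is entirely derivable (with a guard against circular definitions)."""
    if atom in seen:
        return False
    return any(all(derivable(b, seen | {atom}) for b in body)
               for body in definition.get(atom, []))

def reflects(atom, goal_holds_under):
    """Definitional reflection, schematically: to draw a conclusion
    from the assumption `atom`, it suffices that the conclusion holds
    under each of the defining bodies of `atom`.  The parameter
    goal_holds_under(body) stands in for those subderivations."""
    return all(goal_holds_under(body) for body in definition[atom])
```

The point of the schema is that it quantifies over the definition as a whole: adding a new clause for an atom can invalidate conclusions previously obtained by reflection, which is what makes the principle non-monotonic.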

From the beginning of the 1990s, Hallnäs and I have put more emphasis on definitional reasoning as a foundational approach that goes way beyond logic programming and is a general reasoning principle, as originally intended by Hallnäs (1991, 2006) (see my A1993, A1994b). Since it can be applied to any system of definitional rules, thus also to a rule defining an atom in terms of its own negation, it has applications to paradoxical reasoning and to any kind of non-wellfounded definition. This is why Hallnäs spoke of "partial" inductive definitions. This had implications for my later work on paradoxes. Hallnäs had done his Ph.D. with Prawitz in 1983 on normalization in set theory (Hallnäs, 1983), where he had engaged deeply with the proof-theoretic treatment of paradoxes. Our work on definitional reflection quickly grew into a friendship between us and our families, with many short and long visits to each other's homes. Lars is

also a composer by training<sup>21</sup>, and his present for my 60th birthday was a composition for organ.

Unfortunately, we were not very successful with these ideas about definitional reflection. This is partly due to the fact that philosophical proof-theoretic semantics is obsessed with logical constants, which are particularly well-behaved. Many do not realize that an inductive definition is like a set of introduction rules, a fact well established in mathematical proof theory. Dale Miller and his group were essentially the only ones beyond groups in Sweden to appreciate this approach. Had there been more resonance, we would have pursued it further in more advanced directions, including definitional reflection for definitions of functions and functionals. There is some work by Hallnäs in this direction (see his contribution to this volume, Hallnäs 2023), which presents a good starting point for such investigations.<sup>22</sup>

**Kosta Došen and logical constants.** I met Kosta Došen at the logic colloquium in Florence in 1982. He had recently finished a D.Phil. thesis on logical constants with Michael Dummett and Dana Scott at Oxford, in which he developed a theory of logicality based on the idea of iterated sequents (today one would speak of "hypersequents") and on rules that could be read both downwards and upwards, which he called double-line rules. Later, through his publication on logical constants as "punctuation marks" (Došen, 1989), this approach became quite popular and widely discussed. I approached him personally in Florence because I was attracted by the abstract of his talk, in which I saw immediate similarities to my idea of rules of higher levels. There was a certain difference, as his higher-level sequents essentially served for the interpretation of modal connectives, whereas mine served for the interpretation of implication, for which he did not need any higher-order entities. When we met, we got on well together, and we stayed friends for life. In Konstanz I worked with him on the notions of conservativeness and uniqueness of logical operators. In our joint papers A1985 and A1988, written in Konstanz in the mid-1980s, we showed the duality of these notions and reduced the uniqueness problem to the interaction of two consequence relations. Later we initiated the topic of substructural logic (see below, Section 8). At the beginning of the 1990s he and his wife stayed with us in Tübingen for some time, as he could not return to Belgrade during the Yugoslav war. In the middle of the 1990s, Kosta Došen turned towards category theory, which I did not find so interesting at the time. He became one of the leading figures in categorial proof theory, building very much on the work of Joachim Lambek.
This was roughly at the time, in 1998, when he gave up a full professorship in Toulouse, which he had held since 1994 after a two-year stay in Montpellier (that is, from the breakup of Yugoslavia onwards), to take up a professorship in Belgrade, where after his arrival he spent nights in bomb shelters during the NATO air raids of 1999. Later on, towards the 2010s, our contacts became more frequent once again, as we realized that his categorial approach to logic and my idea of the primacy of the hypothetical over the categorical converged. Sadly, in spite of an innovative cancer treatment in

<sup>21</sup> For his musical biography see https://quatuorbozzini.ca/en/artiste/hallnas\_la.

<sup>22</sup> Cp. our notes presented to Dag Prawitz on his 80th birthday (M2016).

Germany the year before, he died in 2017, aged only 63. I delivered a eulogy at his funeral (M2022c).

**Neil Tennant and the Scottish connection.** I first got to know Neil Tennant through a paper in the *Journal of Philosophical Logic* (Tennant, 1980), in which, among other issues, he claimed to have given a proof of normalization for full classical logic, something which was correctly proved by Gunnar Stålmarck (1991)23 some time later. I pointed Tennant to the deficiencies in his proof, and he invited me for a short stay at Stirling (Scotland) in 1982, where he had just started an appointment as a professor. It was there that I gave my first talk abroad in English. All presentations I had given before had been in German; times were different from today, when English has become the standard language of communication even in my Tübingen department. A follow-up stay, again with Gabi, took place in Winter/Spring 1983, when I was a fellow of the Institute for Advanced Studies in the Humanities in Edinburgh, which was very productive.

Through Neil Tennant we met Stephen Read, and through him Roy Dyckhoff. This led to an enjoyable and productive several-month stay in St. Andrews in spring 1985. Through Read I became acquainted with "standard" relevance logic; Tennant's deviating approach, from which his later *Core Logic* (Tennant, 2017) evolved, was already known to me. Particularly interesting was the proof-theoretic approach distinguishing different conjunctions at the structural level, something that Read and John Slaney had developed relying on earlier work by Michael Dunn. It was based on so-called "bunches" as a specific sort of structural entities and made it easy to formulate general rules for certain intensional connectives. I was quite enthusiastic about this topic and included it later in my Habilitation thesis (see above, Section 6).

Roy Dyckhoff, who had come from category theory and was now working in logic in computer science, I originally met as an attendee of the philosophical seminars in St. Andrews. As his work was so closely related to mine, we collaborated intensively over the years. He was also somebody working on a contraction-free calculus for intuitionistic propositional logic, independently of and in parallel to the results in the Ph.D. thesis of Jörg Hudelmaier, who was a member of my group in Tübingen in the 1990s. Roy Dyckhoff was the first to propose general elimination rules without higher levels ('flat' general elimination rules, see below, Section 10). He died in 2018, aged 70. At his memorial service at St. Salvator's Chapel of the University of St. Andrews the bells were rung, for which he himself, as a passionate bell-ringer and theoretician of bell-ringing (see Dyckhoff, 2018), had raised funds.

#### **8 Tübingen 1989–1997: Logic, philosophy, computer science**

**Application and appointment.** After unsuccessful applications elsewhere, I became professor of logic and philosophy of language at the University of Tübingen in 1989. The support of Franz Guenthner, who was professor of general and computational

<sup>23</sup> I was an examiner of Stålmarck's Stockholm M.A. thesis which resulted in this article.

linguistics in Tübingen, was decisive both for the installation of this professorship and for my appointment. He was the editor, with Dov Gabbay, of the *Handbook of Philosophical Logic* and had first taken notice of my work through Göran Sundholm's *Handbook* contribution on proofs and meaning (Sundholm, 1986). As mentioned above in Section 5, Göran had been a reviewer for the *JSL* of my thesis publication. My application lecture was on "Logic and Cognition"; I tried to give an impression of my inclination towards cognitive science going beyond logic in the narrower sense.

**Institutional struggles, offer from Berlin.** In the fall of 1989 I started to teach in Tübingen. This was a wild time politically. I remember seeing the pictures of the collapse of the Berlin wall on TV during a stay in Tübingen. We were extremely lucky to buy a house in Tübingen at a very reasonable price in January 1990, which we rebuilt and refurbished so that we could move into it at the end of the summer of 1990, just in time for school and the academic year. For me the first years in Tübingen were accompanied by certain struggles due to institutional issues between philosophy, computer science and linguistics. With the backing of the computer algebraist Rüdiger Loos, this was finally solved for me by moving into the newly founded department of computer science, with philosophy as a second affiliation, so that I had a joint appointment in computer science and philosophy, which for me and my range of interests was ideal. In 1991 I took my office in the computer science building, a former rehabilitation hospital with a beautiful view over the hills nearby, and have stayed there until today.

Shortly after starting the Tübingen position I received an offer of a professorship in logic at the Free University Berlin. David Pearce, who at that time was working in Berlin, did all he could to get me there, even though he was a strong candidate for the position himself. However, in the end I decided to stay in Tübingen, where I had just started and our family had just settled.

**Substructural logics.** Apart from "proof-theoretic semantics" I am happy to have been involved in the coining of another term: "substructural logic" or "substructural logics". This term made it into the Mathematics Subject Classification, its current codes being 03B47 and 03F52 (American Mathematical Society, 2020). The situation unfolded as follows. I had just started my professorship in Tübingen in autumn 1989, while Kosta Došen had received a Humboldt grant for 1990–1991 and was staying in Konstanz. Franz Guenthner was in possession of funds that he offered to me to spend on promising conferences. The first one, in December 1989, was on *Extensions of Logic Programming* — see above (Section 7); the second one, in October 1990, was organized together with Kosta Došen and originally entitled *Logics with Restricted Structural Rules*. We managed to invite top, or better the top, people in the respective fields. Our speakers were J. Lambek, S. V. Soloviev and J. van Benthem for the Lambek calculus, R. K. Meyer and J. M. Dunn for relevant logic, J.-Y. Girard, G. Sambin and A. Scedrov for linear logic, and V. Grishin and H. Ono for BCK logic. One or two days before the conference started, Kosta rang me up: "*Logics with Restricted Structural Rules* is too clumsy, we must find a term that catches. I propose *Substructural Logics*. According to the Oxford English Dictionary it means 'relating to a substructure', that is, to the foundation of something." Initially, I was

opposed to this, as it sounded like "subculture" to me, but eventually I gave in. The volume that resulted from the conference then bore this title (C1993). While the title of the book worked well, publication came with some unexpected problems. We had proofread everything except the title page and cover, and when the book came out, the editors appeared in non-alphabetical order: Schroeder-Heister & Došen. Because it was me who corresponded with Oxford University Press, they had assumed I was the designated first editor and changed our specified author order! Cover disaster aside, the project was a great success. A vast amount of literature on "substructural" logics has appeared in the meantime, including two textbooks.
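For orientation (a standard textbook presentation, not specific to our conference volume): the structural rules in question are Gentzen's, and the best-known substructural logics arise by restricting them.

```latex
% Gentzen's structural rules, left-hand side versions:
\infer[\text{weakening}]{\Gamma, A \vdash C}{\Gamma \vdash C}
\qquad
\infer[\text{contraction}]{\Gamma, A \vdash C}{\Gamma, A, A \vdash C}
\qquad
\infer[\text{exchange}]{\Gamma, B, A, \Delta \vdash C}{\Gamma, A, B, \Delta \vdash C}
% Rough map: no exchange -> Lambek calculus; no weakening -> relevant
% logic; no contraction -> BCK logic; neither weakening nor
% contraction -> linear logic.
```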

**Logic and cognition.** Even though the formal denomination of the professorship in Tübingen was "Logic and Philosophy of Language", and even though I was a full-fledged logician, my interests had shifted in the time after my Ph.D. to many other subjects, and in particular to the cognitive realm. There had been various influences on me, and my involvement in cognitive psychology through Gabi certainly played a major role. Most interesting to me was that people started to see connections between logical or symbolic reasoning in the narrower sense and more general reasoning models developed in the cognitive sciences. Moreover, my position was connected to the "Institute for Natural Language Systems"24 founded by Franz Guenthner, with its associated curriculum and degree in General Linguistics with Psychology and Computer Science. This was a four-year M.A. course in cognitive science with emphasis on (both general and computational) linguistics, with computer science and psychology as obligatory minors. It had absolutely brilliant (individually selected) students from different backgrounds who had all sorts of interests besides their main fields of study. I remember, for example, a very stimulating talk by Joseph Weizenbaum, whom they invited to discuss the social impact of recent developments in information technology and artificial intelligence. They were a pleasure to teach, an experience that I had already enjoyed two years prior, as a substitute professor.

This suited me extremely well, and the titles of the courses I initially taught in Tübingen were all, in a sense, related to the cognitive field. Almost immediately after my start in Tübingen, in the autumn of 1990, I led seminars on connectionist modelling. My philosophy colleague Rüdiger Bubner complained when I made an announcement using the title "PDP", which sounded to him like the name of a political party in the GDR (it ceased to exist shortly afterwards in October 1990). He was right, of course, so I named it "Connectionism" and discussed the topic using its full name: "Parallel Distributed Processing", the title of the collection by David Rumelhart and James McClelland (1988) that was being widely debated at the time. I also taught philosophical foundations of cognitive science, philosophical critiques of AI, causality, connectionist reasoning systems and the like; many issues that would fit very well into the current landscape, where AI is the big thing, including in philosophical discussions.

A great success was a four-year research grant on the cognitive side of deduction as part of a programme called "Cognition and the Brain" by the German national science foundation (DFG) at the beginning of the 1990s together with a simultaneous grant

<sup>24</sup> "Seminar für natürlichsprachliche Systeme"

on efficient Gentzen systems within the DFG programme "Deduction". I was able to hire as postdocs Venkat Ajjanagadde, who had just finished a much-quoted paper with L. Shastri on the connectionist representation of rules (later published as Shastri & Ajjanagadde 1993), and Seppo Keronen, who had done his Ph.D. with R. Stanton on "Computational Natural Deduction" (D1991); and, as a doctoral student, Uwe Oestermeier, who finished with a thesis on pictorial and logical thinking, which won a dissertation prize at the university (D1993).

What I was interested in, with respect to cognitive science, was the establishment of a proper combination of symbolic and connectionist modelling of logical reasoning, something that would be at the cutting edge of science nowadays. Had I pursued that further, my research could play a role in today's discussions, but this would have been at the expense of proof-theoretic semantics.

My interests in connectionism and AI were essentially from the philosophical side. At the same time I did a lot of teaching in theoretical computer science, as in the beginning of the 1990s the chair devoted to that field was not yet filled. I taught advanced topics such as denotational semantics of programming languages, but also elementary courses on formal languages and computability with large audiences. This was very much in accordance with my interests in logic programming as well as in type-theoretic approaches and thus advanced functional programming.

At the beginning of the 1990s I also had Jörg Hudelmaier working with me, who had done excellent work on the complexity of intuitionistic theorem proving, and with whom I published a paper on the non-commutative Lambek calculus (A1995) demonstrating certain limits for cut elimination. Ernst Zimmermann, whose Ph.D. thesis of 1995 on the philosophy of modal logic (D1995) I co-examined, has been present in my group for the past 20 years.

In Tübingen I had two colleagues on whom I relied heavily. On the philosophy side this was Walter Hoering, who was the logic professor in philosophy until he retired in 1998 (M1998a, M2019a), and on the informatics side this was Rüdiger Loos, who was instrumental for my affiliation to computer science (see above) and without whom computer science would not exist in Tübingen in its current, independent, and strong form.

Large public lecture series on "Music in the Sciences" and "Music and Informatics" (organized together with Loos), on "Thinking and Calculating", and a full public lecture series on the work of Karl Popper immediately after his death in 1995 (delivered with the philosopher of science Herbert Keuth), all of which were very well received, serve to show that at the time proof-theoretic semantics was not yet my key occupation.

#### **9 London 1997–2000: Back to my roots**

**Sabbaticals and research stays.** Longer stays at other institutes and with other colleagues, especially abroad, are normally highlights of a researcher's career. I only mention the longest one, in London, from 1997 to 2000 (others took place in

Edinburgh, Stockholm, Berne, Paris and Oxford). Initially, I had only a sabbatical in the winter term 1997–1998, for which I had an invitation from Dov Gabbay to Imperial College London. We went to England as a family and rented a house in Kingston-upon-Thames, not far from the German School. We extended our stay, and in the end remained there until summer 2000, that is, for a total of three years. From the summer semester 1998 on, I commuted between Tübingen and London, flying to Tübingen for three days a week during the semester and spending the university vacations in London. This was possible at the time without impairing my duties in Tübingen. I still conducted the full teaching programme of nine hours per week of classroom teaching, split between philosophy and computer science. Besides Imperial College, I was, through Edmund Robinson, also affiliated with Queen Mary and Westfield College (QMW, now Queen Mary, University of London). While working there in computer science, I met David Pym, with whom I still collaborate today. On a personal level, staying in London was one of the best decisions we made in our lives, in particular as far as our children were concerned. It was a great experience that shaped their future. They both went on to study and live in England.

**Logic in philosophy.** In a sense the time in London marks my return from a dominant interest in cognitive science (still with an emphasis on logic), which I had retained for more than a decade, to logic, and in particular philosophical logic and philosophy of logic. These were my roots when I started studying in Bonn and also my occupation in my doctorate. Even the computer science aspects of my work I now started seeing from the perspective of philosophical (rather than mathematical) logic. For example, the inversion rules in extended logic programming languages could be seen as philosophically motivated inversion principles in a proof-theoretic semantics of logical and other signs. This did not mean that I gave up cognitive science or computer science, but rather that I considered them as a background and application for more philosophical theories. What contributed to this was the joint Tübingen–Konstanz research group *Logic in Philosophy*, which I initiated and applied for together with Wolfgang Spohn (who had just taken up a professorship in Konstanz) in 1996, with many brilliant philosophical projects both in Tübingen and Konstanz (C2005). The project started exactly when we went to London, so the Tübingen part was partly directed from London. Through this I collaborated with Patrizio Contu and Reinhard Kahle. With Patrizio I published my first paper on hypothetical versus categorical reasoning (A2005), with its philosophical claim that the hypothetical concept of consequence should be given conceptual priority over the categorical concepts of truth or provability.

**The Salzburg offer.** In the year 2000 the University of Salzburg offered me a chair in philosophy — the successorship to Paul Weingartner, who had just retired. This was quite unexpected. I almost decided to accept it; we even registered our children at local schools. However, it finally failed for administrative reasons, which had essentially to do with the German pension system and the losses I would have incurred when moving to Austria, even though it was within the European Union. Turning it down was a hard decision, as there was only little that could be gained in Tübingen; for example, the fact that my position in Tübingen was a second-class professorship, not

a chair, was non-negotiable. Even though I declined it, the Salzburg offer contributed to my intellectual shift back towards philosophy.

#### **10 Tübingen 2000–2019: Topics in proof-theoretic semantics**

During my time in Tübingen, and in particular after the stay in London, which fixed, so to speak, the topic of proof-theoretic semantics as the key topic of my group, I have worked on various aspects of proof-theoretic semantics. There was no linear order. I tried to identify critical topics and find solutions to certain problems, though many of them are still open (A2016a). I shall just mention a few of them.

**Definitional reflection, paradoxes and intensional proof-theoretic semantics.** I have already mentioned Lars Hallnäs's idea of definitional reflection and the idea of definitional reasoning, to which my mind always kept returning. It is clear that definitional reasoning constitutes a kind of intensional proof-theoretic semantics, as definitions are intensional entities. It depends on the sort of definitions whether the system obtained is well-behaved. Hallnäs called this well-behaviour, where we have cut elimination and the like, "total". Changing the definition, that is, changing the meaning of terms, changed the global behaviour of the system, so there was a strong sense of non-locality. The standard example was always the definition of *p* by *not-p*, which should be allowed (as in logic programming), but which globally leads to systems without normalization or cut elimination and which can be seen as the skeleton structure of the various logical, semantical and mathematical paradoxes. One follow-up question is then under which conditions definitions are "well-behaved" and are in this sense extensional entities. In several papers I showed that this strongly depends on the structural rules available and on how these structural rules interact (A1992a, A2016b), including the seemingly trivial structural identity axiom ("*p* entails *p*"). I could also show that, in order to avoid paradoxes, a very limited restriction of the rule of contraction is sufficient, in contradistinction to abandoning this rule altogether (A2012b), based on an intensional distinction of the ways a formula is 'given' to us — either through a definition or without any specification (A2022b). I even speculated on the idea of a 'free' type theory, in which the application of rules depended on the evaluation of certain definitional terms (A2012d).
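As a schematic reconstruction (my sketch, simplifying the systems discussed in the papers cited): with the single clause defining *p* by *not-p*, closure and reflection together generate a derivation of absurdity, and it is precisely a contraction step that the restriction mentioned above blocks.

```latex
% Paradoxical clause:  p \Leftarrow \neg p
% (1)  p, \neg p \vdash \bot   (\neg\text{-elimination})
% (2)  p, p \vdash \bot        (reflection on the second occurrence
%                               of p, from (1))
% (3)  p \vdash \bot           (contraction -- the critical step)
% (4)  \vdash \neg p           (\neg\text{-introduction, from (3)})
% (5)  \vdash p                (definitional closure, from (4))
% (6)  \vdash \bot             (cut of (5) with (3))
```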

A purely intensional paradox was discovered by Jan Ekman in a thesis he wrote under the supervision of Hallnäs (Ekman, 1994). By translating set-theoretic paradoxes into propositional logic he could show that under certain assumptions derivations in propositional logic cannot be normalized. This has fascinated me ever since its appearance, and I often mentioned it in talks. I knew Ekman from the time when he was still a doctoral student, and would have liked to employ him on a post in the research group *Logic in Philosophy*. However, he had already accepted a permanent position in industry at that point. Only in recent years did I manage to write two papers on the subject, together with Luca Tranchini (A2017, A2021). By that time Luca had embarked on a different notion of intensional proof-theoretic semantics, which is concerned with the identity of proofs, and an objection we made was that from this point of view Ekman's translation has certain deficiencies. This is closely related to the issue of harmony, and Luca is pursuing this strand of research.

In fact, I came across the issue of identity of proofs when I was external examiner ("opponent") at the doctoral defence of Prawitz's student Filip Widebäck in 2001, who had written on exactly that topic (D2001). In preparing my talk I noticed that Widebäck presented results that had been obtained independently by Kosta Došen at the same time. Only much later did I realize the significance of these issues for the sort of proof-theoretic semantics I was advocating. Incidentally, a thesis defence in Sweden, with an external opponent who first presents the thesis (to a potentially large audience) and then publicly questions the candidate about it, is a great experience in its own right, even if it costs the opponent a week of preparation, as he needs to read the thesis really carefully. I had already acted as opponent in Sweden once before, for Lars-Henrik Eriksson in computer science, who had worked on extensions of definitional reflection (D1993).

**Direct negation, Béziau's square, and bilateralism.** In the still unpublished part of my doctoral dissertation, I had already discussed an extended natural deduction system with a 'structural' kind of negation — structural, because it is built into the deduction machinery and is not a logical constant. Such negations, also called "denial", sometimes "strong negation", have been considered for a long time, not only in philosophy but also in computer science. I discovered that the idea of definitional reflection gives rise to a slightly different sort of denial, as it allows a distinction between a direct denial based on definitional clauses, and an indirect denial induced by definitional reflection. The latter resembles, in a rough way, what in logic programming is called negation as failure. Correspondingly, we obtain two sorts of assertion, namely direct assertion (by definition) and indirect assertion (by failure). I presented some initial ideas at the first Uni-Log (Universal Logic) conference organized by Jean-Yves Béziau in Montreux in 2005. This was an excellent meeting with, for example, Saul Kripke present (as a so-called "secret speaker", who was not announced beforehand).

It turned out that this idea with two sorts of assertions and two sorts of denials could be put into the structure of a square, so I presented it in 2007 at the first conference on the square of opposition, organized by Béziau, again in Montreux. Initially it seemed to me a crazy idea to devote a conference to this topic, and later a whole series of conferences, but it turned out that there were sufficient serious ideas in it to compensate for the stranger aspects. The conference proceedings only appeared in 2012. I still consider my contribution there (A2012a) significant, although it had no impact whatsoever. When I was invited to the third congress on the square in 2012 in Beirut, I presented an even further developed variant of my approach, calling it a "calculus of squares". I arranged the two assertions (or positions) and the two denials (or negations) at the corners of a square, proposed a sequent calculus for such entities and discussed major theorems (cut elimination etc.) for such a system. The conference was an impressive event at the beautifully located American University. People in Beirut tried to live their lives normally despite the Syrian war having started the year before. To us as visitors, at least, it was not much felt there yet. I have not continued this

line of research since, partly because of a lack of resonance, partly because I have been engaged in other issues of proof-theoretic semantics. I think it has great potential. It would fit very well into today's discussion of what is called "bilateralism".

Béziau also invited me to give a course on proof-theoretic semantics in 2007 at the second Uni-Log conference in Xi'an (China), which took place after the LMPS congress in Beijing. Even though, or perhaps because, I essentially talked to colleagues rather than to students, it was extremely instructive to me. I later gave a related course at the ESSLLI conference in Bordeaux in 2009, this time to students (A2009a). I had already given a course with Lars Hallnäs at ESSLLI 1993 in Lisbon and have always considered the ESSLLI summer schools a great success story. Thomas Piecha and I also organized a workshop at the sixth Uni-Log in Vichy in 2018 (C2018b). Jean-Yves Béziau, through marketing projects in a way one is not normally used to in logic, contributes to keeping the machinery of research interaction running. I myself have strongly profited from this, as I would probably not have developed certain ideas otherwise. The fact that we now have a *World Logic Day* (14 January), approved by UNESCO and celebrated even during the Covid pandemic as a massive collection of online events, is also the result of Béziau's initiative.

**Assumption-conclusion bilateralism, hypothetical reasoning and the dogma of standard semantics.** As far as bilateralism is concerned, I have always understood it in the sense that reasoning should not only be one-sided, from unspecific assumptions towards a specific conclusion, but two-sided in the sense that in the course of reasoning both the assumptions and the conclusion can be modified according to semantic rules. This is in certain ways related to negation-bilateralism, but rests on different intuitions, as no negation is involved in the first place. I have argued in favour of this sort of bilateralism25 in the context of criticizing standard semantics. By standard semantics I mean both model-theoretic semantics and most intuitionistic semantics, such as BHK, realizability, validity-based semantics in the sense of Prawitz and Dummett, etc. All these semantics start from a categorical concept such as truth or validity. Then a hypothetical concept of consequence is defined as the transmission of this categorical concept under all circumstances. Assumptions are nothing but placeholders for categorical entities (truths or valid proofs) — the specific semantics only applies to conclusions. This motivates the emphasis on introduction rules in standard proof-theoretic semantics. I called this a *dogma*, as there is no real reason for choosing this sort of approach. Why not choose a hypothetical concept of consequence as a basis and then derive the categorical from the hypothetical rather than vice versa?
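The transmission view criticized here can be stated schematically (a standard formulation, given purely for illustration):

```latex
% Model-theoretic consequence: transmission of truth to the conclusion
\Gamma \models A
  \;:\Longleftrightarrow\;
  \text{for all models } \mathcal{M}:\;
  \bigl(\mathcal{M} \models B \text{ for all } B \in \Gamma\bigr)
  \;\Rightarrow\; \mathcal{M} \models A
% Validity-based proof-theoretic analogue: a derivation of A from
% the assumptions in \Gamma is valid iff it transforms valid closed
% proofs of the members of \Gamma into a valid closed proof of A.
```

In both cases the categorical notion (truth in a model, validity of closed proofs) comes first, and the hypothetical notion is defined from it.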

There is a formal model of such an approach in the form of the sequent calculus, where one can modify both the left and the right side by independent rules. There is also categorial proof theory, which is essentially based on such an approach, as one deals

<sup>25</sup> I spoke of "bidirectionality" instead of "bilateralism". Perhaps this is a preferable term to avoid terminological confusions.

with hypothetical entities (arrows) from the beginning.26 However, this has never been developed into a philosophical foundational theory.

I developed this idea in the late 1990s together with Patrizio Contu, who very strongly insisted on it (A2005), in our research group "Logic in Philosophy", and have 'preached' it ever since at many conferences and on many occasions, with limited success. Certainly, my arguing was more programme than execution, but the target was clear: developing a semantics of hypothetical judgements as primordial entities, that is, without presupposing categorical judgements. I remember presenting it at the conference of the Society for Analytical Philosophy in Bielefeld in 2004 (A2004), at the Logica conferences in Hejnice in 2007 and 2008 (A2008b, A2009b), at a conference at the Swedish Collegium for Advanced Study (SCAS) in Uppsala in 2010 (A2012e), where a great deal of the proof-theoretic semantics community was present, on several occasions in Kosta Došen's colloquium in Belgrade, as well as in a manuscript for a special issue of *Erkenntnis* that never saw publication (M2008c). Perhaps there will at some point be a student prepared to work out this concept in detail, on both the classical and the intuitionistic side.

**Harmony.** Harmony — a term introduced in this context by Dummett (1973, pp. 396f.) — is considered one of the fundamental notions of proof-theoretic semantics. The conditions for asserting a sentence should match the conclusions that can be drawn from it. What it should mean in detail is still a matter of discussion. In my thesis I tried to achieve harmony by generalized elimination inferences for logical constants, which guarantee it due to their syntactic form. To enable this I had to introduce higher-level rules, as otherwise implicational connectives could not be interpreted. They are also a crucial tool for interpreting certain connectives, for example connectives of relevance logic such as fusion (A1987c). Now there is also another variant of generalized elimination rules, put forward by Dyckhoff, Edgar López-Escobar, Tennant and Jan von Plato, which are of an elementary level (called "parallelized" by Tennant) and are a natural deduction translation of Gentzen's sequent calculus rules (see my A2014b, and above, Section 7). Their adherents have always preferred these 'flat' rules and even claimed that they provide a framework as powerful as mine for generating harmonious rules. It bothered me for many years that I did not have an easy argument at hand that certain constants could not be expressed using flat generalized elimination rules. This changed when I met Grigory Olkhovikov. He was a student of Grigori Mints, and Mints recommended him to me at a conference in Bochum in 2012. When I posed the problem to him — translated into a problem in intuitionistic propositional logic — he came up with a solution, which resulted in a joint paper in the *Review of Symbolic Logic* (A2014a, with an extension to arbitrary levels in A2014b). This settled the issue, although incorrect claims continue to be propagated.
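The contrast can be displayed schematically (square brackets mark discharged assumptions; this is my sketch of the two rule formats, not a quotation from the papers mentioned): the flat rules discharge formulas, whereas the higher-level elimination rule for implication discharges a rule.

```latex
% 'Flat' general elimination rules:
\infer[\wedge E]{C}{A \wedge B & \deduce{C}{[A,\, B]}}
\qquad
\infer[{\to}E_{\text{flat}}]{C}{A \to B & A & \deduce{C}{[B]}}
% Higher-level variant: the side derivation of C may use, and then
% discharge, the rule (A \Rightarrow B) itself:
\infer[{\to}E_{\text{hl}}]{C}{A \to B & \deduce{C}{[A \Rightarrow B]}}
```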

However, this was still a limited notion of harmony, as it only laid down appropriate elimination rules given certain introduction rules (something which, by the way, could also be inverted; A2015c). A more universal notion would tell us when a given pair

<sup>26</sup> Note that here the term "categorial" refers to mathematical category theory, whereas "categorical" is used in the traditional philosophical sense of denoting a categorical in contradistinction to a hypothetical judgement.

of introduction and elimination rules is in harmony. I made such a proposal in my contribution to the "Outstanding Contributions to Logic" volume in honour of Dag Prawitz (A2015a) and in a subsequent paper in *Studia Logica* (A2014d), where I gave second-order translations of the introduction and elimination meanings of connectives and declared them to be in harmony when these translations are equivalent. This was a conceptual achievement, I think, even though it is 'only' an extensional notion of harmony based on deductive equivalence. Luca Tranchini is developing this approach further in the direction of an 'intensional' notion taking proof identity into account. (I made some remarks on this issue in A2016a.) He elaborated on this idea partly through interaction with Kosta Došen and his work (we collaborated with him quite intensively in the 2010s), and also worked with Paolo Pistone, who had done his Ph.D. with Girard and was in Tübingen as a postdoc in 2018–19.27

**Incompleteness and atomic systems: Beyond (intuitionistic) logic.** As mentioned above (Section 6), the idea that proof-theoretic semantics could in a sense justify intuitionistic logic, in that the latter is complete with respect to the former, has fascinated me ever since my Ph.D. thesis. In the past fifteen years, Thomas Piecha and I have essentially come to a negative conclusion. This was considerably stimulated by exchanges (including mutual visits) with Wagner de Campos Sanz and also with Tor Sandqvist, both of whom had the idea that certain only classically valid formulas are verified by proof-theoretic semantics. Several papers with de Campos Sanz and Piecha (A2014, A2015) resulted from that, and in the end Piecha and I found a counterexample free from deficiencies, depending only on a few plausible assumptions about the proof-theoretic semantics used (A2019c). This is, of course, a somewhat negative result, and not what I had expected 35 years ago, but perhaps it is a prejudice to think that intuitionistic logic is *the* proper logic of constructive or operational reasoning. As a desideratum there remains, of course, the task of describing this proper logic, if there is (a finitely axiomatizable) one.

Closely interconnected with the completeness problem is the question of the atomic base of proof-theoretic semantics. In model-theoretic semantics it is structures that tell us which atomic sentences are true and which are false. In the proof-theoretic case one considers atomic deduction systems instead. However, which kinds of such systems are admitted strongly influences the logic one obtains, in particular if in atomic systems not only production rules but more general rule concepts including definitional reflection are allowed, as joint work with Piecha shows (A2016b, A2017). Quite independently of any logic built on top of atomic systems, these systems in themselves represent a highly interesting topic that goes way beyond logic and impacts, for example, the theory of inductive definitions as well as rule-based argumentation theory. Given the current interest in the latter, this is perhaps a future research field to invest in.

**The format of deduction.** One topic that interacts with almost all others is which form a logical deduction should take. Proof-theoretic semantics has a bias towards natural deduction, but the sequent calculus is another possible format which fits

<sup>27</sup> Cp. their joint contribution to this volume (Pistone and Tranchini, 2023).

very well in particular with approaches that want to make assumption-conclusion bilateralism explicit, and allows one, for example, to establish a perfect duality between conjunction and disjunction, something on which Došen's categorial approach and Giovanni Sambin et al.'s Basic Logic (Sambin, Battilotti, and Faggian, 2000) rest. Definitional reflection works under both formats, and I have argued that it has certain advantages over Sambin et al.'s approach (A2013). However, as soon as it comes to implication, not even the standard rules of the sequent calculus are exempt from criticism. I have claimed that for the introduction of implication on the left side (the assumption side) different rules might be considered, if one wants to distinguish between implications as rules and implications as links (A2011b, A2014b)28 and thus to disentangle logical from structural features of implication. This also plays a role if one wants to give the approach of higher-level inference rules in my doctoral thesis a sequent-style formulation, as Arnon Avron has pointed out (Avron, 1990 — incidentally the only paper referring to me in its title). In addition, there is the even more fundamental question of the role of proof search, that is, the idea that in reasoning we often start with a goal that we want to prove, and then *reduce* it to subgoals. Perhaps "reductive" reasoning in this sense should be given a more fundamental stance, as is the case, for example, in dialogue logics. We have been involved in the latter through research grants (see Section 11), and Thomas Piecha and I have worked on it (A2012, A2015a) — in fact, Piecha wrote his doctoral thesis on it (D2012), which represented a major advance for the field. However, I think the general problem of deduction versus reduction still needs to be settled, possibly in a much wider framework.
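For orientation, the standard rule in question, the left-introduction rule for implication in Gentzen's sequent calculus, reads as follows in one common context-splitting formulation:

```latex
% From a proof of A, and a proof of C from B, an implication A -> B in
% the antecedent licenses the conclusion C.
\[
\frac{\Gamma \Rightarrow A \qquad \Delta,\, B \Rightarrow C}
     {\Gamma,\, \Delta,\, A \supset B \Rightarrow C}
\quad (\supset\Rightarrow)
\]
```

The alternatives alluded to in the passage would replace this rule when implications in the antecedent are treated as rules rather than as links.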

#### **11 Group, grants, cooperations**

In the sciences it is absolutely standard to work with a group of people with whom one can discuss problems and results, and to publish with them. In the humanities, there is still a considerable number of single researchers who reach (often outstanding) results without much interaction. I definitely wanted to have at least a small group of, say, three people to talk to. I had some money from the university to pay for one position, which I secured when I turned down the Salzburg offer. For all the rest grants were needed, both doctoral and postdoc grants.

I mentioned already the grants in cognitive science and deduction as well as the research group "Logic in Philosophy" during my first decade at Tübingen and London (Sections 8 and 9). Successful grants after 2000 were a grant from the European Science Foundation (ESF) on "Dialogical Foundations of Semantics" (2008–2012) for a joint project with groups in Lisbon (Reinhard Kahle) and Amsterdam (Benedikt Löwe) as well as a 10-year long French-German collaboration on "Hypothetical Reasoning" (2009–2019) within a joint programme of the French and German national science foundations (ANR and DFG), with the Institut d'histoire et de philosophie

<sup>28</sup> Cp. Michael Arndt's (2023) contribution to this volume.

des sciences et des techniques (IHPST) in Paris. All this made it possible to hire as postdocs Kai Wehmeier, who was brilliant but left after one year to become a professor at the University of California, Irvine, and Bartosz Więckowski, who in his thesis had created the topic of subatomic natural deduction (D2006) — now a growing branch of proof-theoretic semantics. Later, it allowed the recruitment of Rainer Lüdecke, who wrote a thesis on the game semantics of logic programming (D2012), Harald Maurer on connectionist modelling (D2014), Tiago Rezende de Castro Alves (D2019) on identity of proofs, Hermogenes Oliveira (D2019) on proof-theoretic validity and René Gazzari on the notion of occurrence (D2020).

In addition to providing money to employ researchers, the French-German project was a very successful research cooperation through which we made many new friends. It opened up entirely new perspectives, with many exchanges and research stays in Paris, and also a number of joint conferences. I knew Michel Bourdeau and Jean Fichot, the French project leaders, from the 2007 meeting on one hundred years of intuitionism at Cerisy castle (van Atten, Boldini, Bourdeau, and Heinzmann, 2008), where I gave a talk on Lorenzen's *Operative Logic* as an approach to proof-theoretic semantics (A2008a) and met Michael Dummett again after the 1999 Tübingen conference on proof-theoretic semantics (Section 12). As the attendance at the conference exceeded the number of available single rooms, Jean Fichot and I had to share a room, which fostered our friendship. Within our own project, Jean organized a meeting at the same place in 2017.

Another collaborator over the decades was Reinhard Kahle. He worked as a postdoc in our research group "Logic in Philosophy" and has collaborated with us ever since, throughout his long tenure as a professor in Lisbon. In 2018 he was appointed to the newly created Carl Friedrich von Weizsäcker chair for Philosophy and History of Science in Tübingen. It is not a dedicated logic position, but, as Reinhard is a logician, logic will be represented there (my own professorship has been discontinued as a logic position).

Most important for shaping the field of proof-theoretic semantics in Tübingen were the three members of my group who were constantly present during the last decade of my work there and whom I would name first when speaking of "my group": Michael Arndt (thesis D2008), Thomas Piecha (thesis D2012) and Luca Tranchini (thesis D2010). I benefited enormously from my interactions with them. None of the papers we wrote together could have been written by me alone, and I am sure that many of my single-authored papers would be of poorer quality without them.

#### **12 The Tübingen conferences on proof-theoretic semantics**

We organized three major conferences in Tübingen on proof-theoretic semantics, which initially served to establish the subject under this name and later to consolidate its status. The first of these I organized in January 1999 at Tübingen castle together with Reinhard Kahle. We had quite prominent guests there, including Dummett, Prawitz, Martin-Löf, Mints and William Tait. We had big problems obtaining a visa for Grigori

Mints, who was living in Stanford but had Russian citizenship. In the end, I made this public in the local daily newspaper. First thing in the morning on the day the paper appeared, I was rung up by Herta Däubler-Gmelin, our local Tübingen member of parliament who at the time was Minister for Justice in the federal government. She solved the problem by intervening through channels available to her, and Grigori was able to come.29

It took seven years for the proceedings to appear (C2006), partly due to me, as I needed quite some time for my own contribution on validity concepts in proof-theoretic semantics (A2006d). Originally I had wanted to write about definitional reflection as an alternative approach to standard proof-theoretic semantics. However, I then thought I should first elaborate the original Prawitz approach to validity, which then took the space allotted to my contribution30. The effect was that many considered the paper to describe my own deeply rooted view, and Prawitz was full of praise for it as presenting some of his ideas in a congenial way. I am no longer certain in what direction the 'most appropriate' approach to proof-theoretic semantics should go, and Prawitz is not absolutely certain about it either (see his contribution to this volume, Prawitz 2023), but in any case it is a paper making a certain conception of proof-theoretic semantics accessible in a thorough way. The special issue of *Synthese* probably paved the way for the term "proof-theoretic semantics". At least, it was roughly from this time on that the term took off and entered a great number of publications.

The second Tübingen conference on proof-theoretic semantics (C2016a), in March 2013, coincided with my 60th birthday. In addition to the conference, where again the big figures of the proof-theoretic semantics community were gathered, Reinhard Kahle and Thomas Piecha organized a birthday colloquium where David Pearce, Heinrich Herre, Walter Hoering, Gereon Wolters, Bartosz Więckowski, Ernst Zimmermann, Marie Duží, Pavel Materna and Gerhard Jäger spoke, all of them significant people in my intellectual development. This was a wonderful present. It also meant that we enjoyed two conference dinners, the second one being a birthday dinner hosted by me. The third Tübingen conference took place in spring 2019, one semester ahead of my retirement. It was the largest one, with a lot of contributed papers and listening guests, reflecting that the subject has reached a mature state (C2019d).

In 2015, initiated by Luiz Carlos Pereira, Thomas Piecha and I organized a conference on *General Proof Theory* to celebrate the 50th anniversary of Dag Prawitz's book on natural deduction. In addition to Prawitz himself, Došen, Martin-Löf, Pereira, Schwichtenberg and Wansing, among others, spoke. The proceedings were later published as a special issue of *Studia Logica* (C2019a). In 2001, Pereira, who had done his doctorate with Dag Prawitz around the time that Dag was acting as external referee for my doctorate, organized a conference in Rio de Janeiro celebrating Dag Prawitz's work, so the Tübingen conference was a kind of continuation of it. At the conference in Rio I presented my historical research on Hertz and on the first paper by Gentzen, which was written in the spirit of Hertz's work (A2002b).

<sup>29</sup> As Grigori said at the dinner, it was the first time in his life that he toasted a government minister.

<sup>30</sup> The more inclusive and much longer first version of the paper is available as M2003b.

This conference made a lasting impression on me, not only because of the excellent presentations, but also because there was sufficient time both for socialising and for seeing the beautiful city. Besides the "standard" sightseeing places I remember well the wonderful reception by Oswaldo Chateaubriand in his house, and also the visit to Rocinha and the beautiful view we (Grigori Mints, Jan von Plato, Ernst Zimmermann and myself) had from the flat roof of a house up there.

#### **13 Service and honours**

My main service to the scientific community was my involvement in the Division of Logic, Methodology and Philosophy of Science (DLMPS, from 2015 Division of Logic, Methodology and Philosophy of Science and Technology, DLMPST). This association organizes its namesake congress every four years, dating back to 1960. I have been to several of these congresses. At the Salzburg 1983 congress I presented my first international conference paper (A1983a). For the 2011 congress, which was to take place in Nancy, the DLMPS executive committee (in particular Wilfrid Hodges, DLMPS president), supported by Gerhard Heinzmann (local organising chair), chose me as the general programme chair, which was of course a great honour. Preparing the congress was a lot of work — we could invite around 70 speakers, and accepted some 650 contributed papers (C2014, C2014–2015). The congress went quite well, in particular as the local organizers were so efficient. At the general assembly I was elected secretary general of DLMPS for the subsequent four years, and shortly afterwards I became in addition treasurer of the association, a job I kept for eight years. At the 2019 congress in Prague I retired from all DLMPST positions. All in all, this sort of activity was a great experience, but its more than ten years were enough to exhaust my administrative energy.

A great honour in my intellectual career was the doctorate *honoris causa* I received from the University of Belgrade. The award ceremony took place in May 2016. It was very nice, in a beautiful Auditorium Maximum, with *Gaudeamus Igitur* being sung. The following weekend we celebrated Easter at Kosta Došen's home with some of his friends. For me it was the first year with Easter celebrated twice, this one being the Orthodox feast, which was very late that year.

#### **14 Retirement and outlook**

As a professor in Germany, one can postpone one's retirement by three years beyond the retirement age, but only if the university agrees. This was not the case for me, as they needed my salary money otherwise (my pension comes from a different source). I just managed to obtain a half-year extension and retired on 1 October 2019. In November, I gave my retirement lecture on "Logic yesterday – today – tomorrow", to which plenty of former colleagues and friends came. Coincidentally, that very day

the UNESCO General Conference proclaimed World Logic Day, something I was happy to announce. Shortly afterwards I left for a half-year research stay at the Swedish Collegium for Advanced Study (SCAS) in Uppsala, where I had intended to work with Dag Prawitz on a joint monograph. Initially it was wonderful. I had perfect working conditions and very interesting and interested colleagues from a variety of fields; Gabi and I were provided with an excellent flat where we could even accommodate our children as visitors. Unfortunately, in March the Covid-19 pandemic struck, and at the urgent recommendation of the German foreign office we broke off our stay at the beginning of April, driving 20 hours non-stop in a rental car back to Stuttgart. Now, in spring 2023, we have had three years of pandemic, a much wider scientific interaction due to videoconferencing, and much time to think about what to do next (M2022d). There are very many loose ends in the topics mentioned above (Section 10), so plenty of work and also ideas to continue. There are also other topics to consider, such as the mathematical applications of Martin-Löf's ideas and results in the context of homotopy type theory (The Univalent Foundations Program, 2013). On the other hand, given the advances proof-theoretic semantics has already made, I doubt I can make much more than incremental progress.

Nevertheless, there are two book projects that I would like to finish. The first one is the monograph with Prawitz on general proof theory. We have been planning it since Prawitz brought up the idea when we met at a conference in Dubrovnik in May 2010 (the conference was on the philosophical nature of logical consequence). Acting as his co-author is a great honour for me, and I regret very much that progress has not been as fast as envisaged. The pandemic and other matters interrupted our work in Sweden. Completing it has first priority for me. The second is a monograph on proof-theoretic semantics. To this I give more time, even though I wanted to write it already four decades ago, as mentioned in the first paragraph of this autobiography. In any case, it is to rely on the book with Prawitz. While Nissim Francez's monograph (2015) covers many topics, including some applications in linguistic semantics, and certainly helps to establish the discipline, I still see so many open problems in proof-theoretic semantics, in particular on the conceptual side, that for me there is no doubt that an advanced textbook covering both philosophical and technical aspects is a desideratum31. I very much hope that I will continue to have the intellectual strength and good health to pursue this goal. As far as the field of proof-theoretic semantics is concerned, it is on the right track. Even though my group is not being continued in Tübingen, there are plenty of groups continuing its topics, the most outstanding one in Germany being Heinrich Wansing's in Bochum. I very much hope that in the long run proof-theoretic semantics, which originated in logic, will prove itself in a great number of applications outside logic.

**Acknowledgements** I would like to thank Thomas Piecha and Kai Wehmeier for having taken on the great burden of editing this volume, and, of course, the authors for their contributions. I am also

<sup>31</sup> Originally, my SEP entry on proof-theoretic semantics (E2012c) was considered a first step in this direction, which was actually based on a much longer draft (available online: M2011c). Given the fast development of the field and changes in my own attitudes towards it, my original conception requires further thought.

greatly indebted to my family, especially my wife Gabi and my daughter Paula, for their help with this chapter, and to Chris Jones for detailed comments on the manuscript.

#### **References**


#### **Publications by Peter Schroeder-Heister**

Most publications can be downloaded from Peter Schroeder-Heister's homepage.

#### **(A) Logic-related publications including significant abstracts**


*Switzerland, 1–3 June 2007*. doi: 10.15496/publikation-72545. Full paper published as Schroeder-Heister, 2012a.


#### **(P) Psychological publications**


#### **(C) Collections edited or initiated**


#### **(D) Dissertations supervised, selected dissertations examined**


#### **(R) Reviews**

Schroeder, P. (1981). Review of: R. H. Wettstein, Eine Gegenstandstheorie der Wahrheit. Argumentativ-rekonstruierender Aktualisierungs- und Erweiterungsversuch von Kants kritischer Theorie (Königstein/Ts.: Forum Academicum 1980). *Dialectica* 35, 361–362. doi: 10.1111/j.1746-8361.1981.tb00789.x.


#### **(E) Encyclopedia articles**


#### **(M) Miscellaneous including selected unpublished manuscripts**



## *Grundlagen der Arithmetik*, §17: Part 1. Frege's Anticipation of the Deduction Theorem<sup>∗</sup>

Göran Sundholm

**Abstract** A running commentary is offered on the first half of Frege's *Grundlagen der Arithmetik*, §17, suggesting that Frege anticipated the method of demonstration used by Paul Bernays for the Deduction Theorem.

The Natural Deduction community owes Peter Schroeder-Heister a large debt of gratitude, not only for his seminal extension of the Natural Deduction techniques to higher-level derivations, but also as the champion advocate of Proof-Theoretic Semantics.1 He has, moreover, been a tireless organizer of successful workshops and conferences at Tübingen that, over the past quarter of a century, have served to put Natural Deduction firmly on the map as a viable alternative for proof-theoretic bookkeeping. It is a pleasure to contribute to a Profile volume devoted to his work, and in his honour. However, by the side of his Natural Deduction expertise, Peter Schroeder-Heister has also followed other lines of research; thus he has also dealt

Göran Sundholm

Institute for Philosophy, Leiden University, The Netherlands, e-mail: goran.sundholm@gmail.com

<sup>∗</sup> I have spoken on this material at the workshop *Mathematics and its Philosophy between the 18th and the 19th Century*, Amsterdam, July 13–14, 2018, and am grateful to Jamie Tappenden for insightful comments. I am also indebted to Dr. Joan Bertran-San Millan, of the Czech Academy of Sciences in Prague, for helpful conversations on Frege's *Begriffsschrift* during the *2nd Prague Workshop on Frege's Logic*, July 1–2, 2019 that he organised, and on other occasions, as well as for his generous help with derivations in the *Begriffsschrift* system. I am also indebted to Matthias Wille for a clarifying account of how Frege passed from a Leibnizian epistemological framework in BS to a Kantian one in GLA. In recent years my Leiden colleague Maria van der Schaar has been a constant sparring partner on all things Fregean. The two referee reports that were offered to me by the Editors were most helpful and I am obliged to the referees for their careful commentary.

<sup>1</sup> Schroeder-Heister's elegant extension of Natural Deduction was presented in (1984d), whereas the reference to (2018d) is self-explanatory.

seriously with the technical, proof-theoretical aspects of Frege's logic, in which work he has found few colleagues and fewer peers.

The Editors of the present volume extended to me an invitation for a contribution on Proof-Theoretic Semantics, and one may reasonably expect that many contributions to the present volume will deal with topics and issues chosen from within that field. Accordingly, a paper on a proof-theoretic theme in Frege might also be welcome, touching as it does on a longstanding topic of interest for Peter Schroeder-Heister.

#### **1 How to read the "Frege conditional"?**

Today, after customary predicate-logic standardization, the "Frege conditional"

[Frege's two-dimensional conditional notation, not reproduced here]

is read as an implication $A \supset C$ between propositions $A$ and $C$, whereas an iterated Frege conditional (∗)

[Frege's two-dimensional notation with conditions $A_1, \dots, A_k$ and consequent $C$, not reproduced here]

is accordingly read as the iterated implication

(i) $A_1 \supset (A_2 \supset (\dots \supset (A_k \supset C) \dots))$,

or, after the fashion of Schütte (1951), as the tautologically equivalent formula

(ii) $\neg A_1 \vee \neg A_2 \vee \dots \vee \neg A_k \vee C$.

Such Schütte disjunctions correspond exactly to Gentzen sequents,

(iii) $A_1, A_2, \dots, A_k \Rightarrow C$,

where, in Schütte's (ii), the negated $A_i$'s match the antecedent formulae of Gentzen's sequent and $C$, of course, is the succedent of the sequent (where more succedent side formulae are possible, as well). That the iterated Frege implication (∗) could also alternatively be read as a Gentzen sequent was made implicitly clear already in Schütte (1951), where reformulations of Gentzen's sequent calculi, and their syntactic cut-elimination theorems, are given using the representation (i) for the *intuitionistic* calculus, and (ii) for the classical version. This representation of classical sequents was further refined by Tait (1968), where instead of treating sequents as disjunctive propositions, they are now cast as disjunctively read sets of propositions.

The sequent rendering of iterated Frege implications was, to the best of my knowledge, explicitly formulated first by the late Pavel Tichý (1988a, pp. 248–252), but also Franz von Kutschera (1996a) and Peter Schroeder-Heister (1997c; 1999b;

2002c; 2014b) have stressed the fact. Tichý made the connection especially clear also graphically by rotating the Frege conditional (∗) 90° clockwise, while altering the notation slightly, thereby producing a familiar result, namely the sequent (iii)

$$A_1, A_2, \dots, A_k \Rightarrow C.$$

Tichý, and especially Peter Schroeder-Heister, further noted that the parallel between iterated Frege implications and Gentzen sequents works well, also down to quite a fine-grained level, regarding rules, axioms, and notations.2 Schroeder-Heister has also studied Frege's permutation argument from *Grundgesetze*, §10, jointly with Kai Wehmeier. His writings on Frege are listed in the references.

#### **2 Fact and condition**

I take as my text the frst half of *Grundlagen der Arithmetik*, §17:

Statt eine Schlussreihe unmittelbar an eine Thatsache anzuknüpfen, kann man, diese dahingestellt sein lassend, ihren Inhalt als Bedingung mitführen. Indem man so alle Thatsachen in einer Gedankenreihe durch Bedingungen ersetzt, wird man das Ergebnis in der Form erhalten, dass von einer Reihe von Bedingungen ein Erfolg abhängig gemacht ist. Diese Wahrheit wäre durch Denken allein, oder, um mit Mill zu reden, durch kunstfertiges Handhaben der Sprache begründet. Es ist nicht unmöglich, dass die Zahlgesetze von dieser Art sind. Sie wären dann analytische Urtheile, obwohl sie nicht durch Denken allein gefunden zu sein brauchten; denn nicht die Weise des Findens kommt hier in Betracht, sondern die Art der Beweisgründe . . .3

Here we must consider the proper interpretation of *Thatsache* and of *Bedingung*. The importance of *Thatsache* — "fact" — in this context goes back to Frege's

3 Instead of linking our chain of deductions directly to any matter of fact, we can leave the fact on one side, while adopting its content in the form of a condition. By substituting in this way conditions for facts throughout the whole of a train of reasoning, we shall finally reduce it to a form in which a certain result is made dependent on a certain series of conditions. This truth would be established by thought alone or, to use Mill's expression, by an artful manipulation of language. It is not impossible that the laws of number are of this type. This would make them analytic judgements, despite the fact that they would not normally be discovered by thought alone; for we are concerned here not with the way in which they are discovered but with the kind of ground on which their proofs rest . . .

English translations of Frege's texts, as a rule, are of *very* uneven quality, though the recent scholarly translation of *Grundgesetze* makes for a pleasant exception here. The horror story of the many diverging renderings of *beurteilbarer Inhalt* in *The Collected Papers* is as discouraging an example as can be. Thus, against this background of inferior translations, I prefer to quote Frege in German. Those who wish to read Frege's *Grundlagen* in English can readily do so in the omnipresent Austin translation, and may benefit from Schirn's (2010) examination of the more recent effort by the late Dale Jacquette (2007).

<sup>2</sup> Per Martin-Löf (in conversation) was convinced that Gentzen knew Frege's writings. He pointed to the matching rules, the naming of rules and axioms, and the notations, especially Gentzen's use of capital Greek letters. In the absence of a direct citation, he considered the available indications about as strong evidence as one could expect to have for claiming that Gentzen knew Frege's *Grundgesetze*.

characterization of analytic and synthetic judgements in GLA, §3 that we shall have occasion to discuss in some detail. A *Thatsache* is not general, or "lawlike", but is particularized, with regard to a specific individual or object: it is an *Einzelthatsache*. This distinction between fact and (general) law is common to the tradition of demonstrative knowledge, from Aristotle's *Analytica Posteriora*, Bk I, Chapter 4, onwards, where we do not have demonstrative, or "scientific", knowledge of individual facts concerning particulars. Frege, no doubt, was familiar with this tradition, from the *Posterior Analytics*, or from Trendelenburg's *Erläuterungen*, a work that most likely was put into his hands by Rudolf Eucken, his mentor at Jena, and later also neighbour across the Forstweg. Furthermore, a *Thatsache* is an item of knowledge and it will be asserted, rather than merely assumed.

Concerning *Bedingung* we take note of a little observed fact regarding the *implication* ($A \supset C$) between propositions $A$ and $C$. It is customary to render this implication into the vernacular as the *conditional*(ization) "If $A$, then $C$" or "$C$, on condition that $A$". Today, however, the customary vernacular rendering of propositions turns them *by definition* into (referents of) *that clauses*: that snow is white, that grass is green, etc. . . .4 The *implication* accepts that-clauses *salva significatione*. When $A$ and $C$ are taken to be that-clauses, putting them on either side of *implies* works. For instance, "that snow is white *implies* that grass is green" is grammatically well formed. *Conditionalization* does not work, however, with that-clauses: "If that snow is white, then that grass is green" is nonsensical and contains too many main verbs. Here we need complete sentences: "if snow is white, *then* grass is green", or "snow is white, on *condition* that grass is green", both work perfectly well. If we wish to deploy the same propositions as above here, the conditional will have to be: "*If* $A$ is true, *then* $C$ is true", or "$C$ is true, on *condition* that $A$ is true", for instance, "that snow is white is true, on *condition* that grass is green is true".

Accordingly, *implication* and *conditional* are not the same. We have the proposition $A \supset C$ and the conditional judgement "If $A$ is true, then $C$ is true", or equivalently, "$C$ is true, on condition that $A$ is true". We may also follow Gentzen, and allow also relations of *consequence* among propositions, in the form of "sequents", German *Sequenzen*, such as $A \Rightarrow C$, to be read: "$C$ follows from $A$", or, equivalently: "$C$ is a consequence of $A$".

Propositions, and thus also the implication ($A \supset C$), may be held to be true. Then "($A \supset C$) is true" is a judgement, as is "[$A \Rightarrow C$] holds": consequence holds from $A$ to $C$. Similarly, the conditional "If $A$ is true, then $C$ is true" is not a proposition, but a conditional judgement in which truth under the condition that proposition $A$ is true is ascribed to proposition $C$: "$C$ is true, on condition that $A$ is true".

The three judgements, of, respectively, truth of the implication proposition, conditional — "hypothetical" — truth of a proposition, and holding of consequence,

<sup>4</sup> Indeed, both Frege and Bolzano avail themselves of an *independent* notion of proposition, known as a (Fregean) *Gedanke* or (Bolzanian) *Satz an sich*. Frege's and Bolzano's independent "Platonic" notions of proposition are available in their respective ontologies and are made to serve as referents of *that-clauses*, but they are not defined by that meaning-theoretical role. The customary current notion of proposition, on the other hand, is not language-independent, like those of Frege and Bolzano, but is *defined* as that for which a *that-clause* stands.


are different.5 They have different meanings, but are *equiassertible*: when the assertion condition for one of these judgements is fulfilled by means of a certain demonstration, so that one is entitled to make it, then so may the other two readily be fulfilled as well. We should note that the judgement (iii) is a higher-level judgement of the kind that Peter Schroeder-Heister has introduced: it expresses the rule that we may pass to the judgement "B is true" from the judgement "A is true".6 Frege's terminological choices in the *Begriffsschrift*, to wit *Bedingung* and *Bedingungsstrich*, make it clear that also the conditional is a viable alternative reading of what today is known as the "Frege implication".

#### **3 A problem of GLA, §17**

After these prefatory observations, I now return to the quoted text from GLA, §17. Instead of linking a chain of inferences directly to an (individual) fact, Frege claims, one may instead consider the implication ⊢ E ⊃ C, where ⊢ C is the conclusion of an inference drawn from the fact ⊢ E in question, because we may let ⊢ E, that is, the assertion that the fact holds, remain standing and use its content E as a condition (*Bedingung*) in the implication E ⊃ C. The conclusion ⊢ C, after all, can be readily obtained by modus ponens from the premises ⊢ E and ⊢ E ⊃ C. If we carry out this transformation repeatedly, inserting antecedent conditions with respect to all the facts in a deduction, we obtain an implication

$$(\ast)\qquad\qquad (E_1 \supset (E_2 \supset (\dots \supset (E_k \supset C) \dots ))).$$

The demonstration of C that draws upon the facts $E_i$, $i = 1, \ldots, k$, can then be cast in a "normal" form:

$$\frac{\begin{array}{c} \mathcal{D} \\ \vdash (E_1 \supset (E_2 \supset (\ldots \supset (E_k \supset C) \ldots ))) \end{array} \qquad \vdash E_1,\ \vdash E_2,\ \ldots,\ \vdash E_k}{\vdash C}\ \ \text{($k$ applications of Modus Ponens, each time dropping one fact $E_i$)}$$

where D is a demonstration of the iterated implication with the individual fact-contents as antecedents, and the other premises are assertions of the facts $E_1, E_2, \ldots, E_k$. Of

<sup>5</sup> The notions of open and closed consequence, and the difference between them, are explained in my (1997, §7, pp. 205–209).

<sup>6</sup> In his dissertation (1981c) and the JSL article (1984d).

the implication (∗), which is obtained through the repeated addition of (singular) fact-contents as *Bedingungen* (conditions), that is, as antecedents of implications, we may note the following: since ⊢ C, of course, we get $\vdash E_k \supset C$ as a theorem by an application of modus ponens from the *Begriffsschrift* axiom $\vdash C \supset (E_k \supset C)$, and, by repetition, also the implication (∗) itself. We see that (∗) is trivially demonstrable, but Frege claims that it is *analytic*, that is, analytic in *his* own sense that was introduced in GLA §3, rather than that of Kant. Demonstrability is plain to view, since ⊢ C is demonstrable; that the implication in question is analytic, however, is not plain to view, under whatever characterization of that notion we might use.
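For concreteness, the transformation can be written out line by line in the case $k = 2$ (the letters $E_1$, $E_2$, $C$ are as above; the step numbering is mine):

$$\begin{array}{lll} 1. & \vdash E_1 \supset (E_2 \supset C) & \text{by the demonstration } \mathcal{D} \\ 2. & \vdash E_1 & \text{assertion of the first fact} \\ 3. & \vdash E_2 \supset C & \text{Modus Ponens, from 1 and 2} \\ 4. & \vdash E_2 & \text{assertion of the second fact} \\ 5. & \vdash C & \text{Modus Ponens, from 3 and 4} \end{array}$$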

Frege's *Grundlagen*, §17, has met with astonishingly little commentary. I am aware of only three discussions. In his *Grundlagen* book Michael Dummett (1991c, p. 70) mentions "*the problem of the value of analytic judgements . . .*":

Frege . . . contents himself with pointing out that it coincides with that of the fruitfulness of deductive reasoning, since we can always transform a piece of reasoning into an analytic truth by framing the conditional whose antecedent is the conjunction of the premises and whose consequent is the conclusion.

The problem of the fruitfulness of deductive reasoning is also known as the "*Paradox of Inference*" that results from the tension between the usefulness and the validity of an inference. The validity of an inference appears to presuppose that the conclusion is *already* "epistemically contained" in the premises, whereas its fruitfulness — its utility, if you wish — seems to demand that the conclusion *does go beyond* what is contained in the premises. Dummett dealt *in extenso* with this vexing issue in his 1973 British Academy Lecture on *The Justification of Deduction*. In the quotation above, he does not solve it, though. In particular, Dummett offers no reason as to *why* the transformation works, no explanation as to why its final result should be *analytic*, nor does he explain why the problem of the value of analytic judgements coincides with the Paradox of Inference.

The components of a *Fregean* "piece of reasoning", that is, I take it, an *inference*, are judgements that carry assertoric force: every inferential component in sight carries an assertion sign that is composed out of a content stroke and a judgement stroke. They are not propositions, whence they *cannot* be conjoined into a conjunction the way Dummett wishes: the conjuncts of a conjunction(-proposition) have to be *propositions* and not (asserted) premise-judgements. Similarly, the consequent of an implicational "horseshoe" (⊃) proposition has to be a proposition, but not the asserted conclusion-judgement. The idea of drawing conclusions, from unasserted premises, that is, from "*open assumptions*" in the sense of Natural Deduction, to an unasserted conclusion, was anathema to Frege. Michael Dummett routinely held to the maxim: *Frege was (always) right*. There are, I think, only two major points where, according to Dummett, Frege was not absolutely right, and the first, which need not detain us here, concerns truth-values in their role as objectual *Bedeutungen* of sentences.

The second point where Frege, according to Dummett, did not get it right was his construal of inference, and it took the genius of Gentzen to find the proper way to deal with this:

Frege's account of inference allows no place for a[n] . . . act of supposition. Gentzen later had the highly successful idea of formalizing inference so as to leave a place for the introduction of hypotheses . . .

#### Indeed,

[i]t can be said of Gentzen that it was he who showed how proof theory should be done.7

In this vein, Dummett uses the format of Natural Deduction from Gentzen's 1933 doctoral dissertation to criticise Frege. However, Gentzen's system, when read strictly according to the letter of the Hilbertian proof-theoretical school, is an uninterpreted one, and, for a fair comparison with Frege, it should be supplied both with meaning explanations for the primitive vocabulary, say, the arithmetical one, and with signs that serve as force indicators, such as ⊢, that clarify the pragmatic dimension of his system.8 Already in my inaugural lecture (1988) at Leiden University I claimed that Bolzano (1837), the Tractarian Wittgenstein (1921), Tarski (1936), and Quine (1950) adopt what I (later came to) call the "*second Bolzano reduction*" of the epistemic notion of *validity of inference* (from judgement to judgement) to the ontological — "alethic" — notion of (logical) consequence (between antecedent propositions and a consequent proposition). Subsequently, in (2006b) (which was presented at the 1999 Tübingen workshop on Proof-Theoretic Semantics and published in its Proceedings), as well as in the survey article (2009), I came to hold that Frege was substantially right in his views on inference: the criticisms of his views on inference arise from taking him, via the Bolzano reduction, to be concerned with (logical) consequence among propositions, rather than with inference from judgement to judgement.9 A fair comparison between Frege and Gentzen will have to make explicit also the pragmatic level and incorporate it into the Gentzen system, and it will force us, I argued in (2006b), to deploy Natural Deduction in its *sequential form*, which was introduced by Gentzen (1936).

Apart from Dummett's, I have been able to find only two further comments on *Grundlagen*, §17. The first was made by Ian Proops (2002, p. 293) in his article "The *Tractatus* on Inference and Entailment":

So when we have a derivation of A ⊃ B from basic logical laws, we have a case when the chain of deductions beginning with A and ending with B has been "reduced to the form in which" B is dependent on A. But since Frege is here in effect (tacitly) assuming one direction of the deduction theorem (viz. if A ⊢ B, then ⊢ A ⊃ B), this means that for him B is dependent on A whenever B is derivable from A. Moreover, if we regard Frege here as envisaging the process of "substituting conditions for facts" and so on to be reversible (and so as in effect tacitly committed to both directions of the deduction theorem), we shall have a

<sup>7</sup> Dummett (1981b, p. 309, and p. 435, respectively), where also several textual references to Frege for his rejection of unasserted, merely assumed premises of inferences can be found.

<sup>8</sup> In (2006b), that is, my contribution to the first workshop on Proof-Theoretic Semantics at Tübingen, 1999, I spell out this perspective on Gentzen's (1933a) framework in considerable detail and offer a comparison with his re-casting of it as a sequential version of Natural Deduction in his "First consistency" paper (1936).

<sup>9</sup> The articles (2009) and (2006b) were drafted in 1998, and presented at meetings in Helsinki and Tübingen shortly thereafter. Unfortunately both had to wait an inordinately long time to appear in print.

case of the notion of dependence being cashed in terms of derivability: B is dependent on A iff ⊢ A ⊃ B iff A ⊢ B.10
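For ease of reference, the two directions of the Deduction Theorem that are at issue in Proops' passage can be displayed in modern notation (a standard textbook formulation, not Proops' or Frege's own):

$$A \vdash B \;\Longrightarrow\; \vdash A \supset B \qquad\text{and}\qquad \vdash A \supset B \;\Longrightarrow\; A \vdash B,$$

where the second direction is immediate by Modus Ponens, while the first requires the rearrangement of the given derivation discussed in the sequel.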

The *Deduction Theorem* that Proops refers to is commonly credited to Herbrand (1930), and to Tarski (1930); *sub specie aeterni*, those attributions may be justified, but the well-known demonstration of the Deduction Theorem by means of the rearrangement of a concretely given derivation was first given by Paul Bernays in Hilbert and Bernays (1934a, pp. 157–158).11 Be that as it may, an appeal to the Deduction Theorem in this context cannot be right. From a Fregean perspective Proops' formula "A ⊢ B" does not make sense.12 In Frege's work, as amply borne out by, e.g., BS, §6, inferences all have asserted (judgements as) premises, and such a passage, from an unasserted formula A to an unasserted formula B, simply is not possible; nor is it contemplated by Frege.

However, there are more Polish twists to our tale. Jaśkowski (1934c), in which article Natural Deduction appeared in print for the first time, simultaneously with Gentzen's paper, opens as follows:

In 1926 Prof. J. Ł u k a s i e w i c z called attention to the fact that mathematicians in their proofs do not appeal to the theses of the theory of deduction, but make use of other methods of reasoning. The chief means employed in their method is that of an arbitrary supposition.

Jaśkowski also reports that he had spoken on this method of deduction in Łukasiewicz's seminar, also in 1926, and elsewhere. Łukasiewicz's connection to Natural Deduction is quite close and goes back a long way. In 1910, as *Privatdozent*, he spent time at Graz when Meinong was preparing his much-enlarged second edition of *Über Annahmen*. In that work Meinong considers *Annahmeschlüsse* ("assumption inferences"). Is this the source that in 1926 prompted Łukasiewicz's suggestion, and, through it, the Polish branch of Natural Deduction?

<sup>10</sup> Proops' passage is part of a discussion of Frege's notion of dependence from the second series of articles on the *Grundlagen der Geometrie*. I have dealt with that Frege material at some length, in connection with Bolzano, in my (2000).

<sup>11</sup> Kleene (1952a, Ch. V, §§21–23, pp. 90–102) gives a meticulous exposition of Bernays' demonstration of the Deduction Theorem, as does Alonzo Church (1956b, Ch. III, §36, pp. 196–201). Kneebone (1963, Ch. 3, §5, pp. 78–86) follows Bernays' style of exposition closely, and unhurriedly explains the hurdles and pitfalls very well. The story behind the Deduction Theorem and the genesis of Natural Deduction is quite complex. Already in (1916) Leśniewski uses what amounts to the Deduction Theorem, or "*implication introduction*", in *informal* reasoning. Subsequently, in his Warsaw lectures, he used that mode of proceeding in formal derivations as well, while his student Tarski (1923; 1924a; 1924b) does so also in print. Tarski claims, in footnote † on p. 32 of the English translation of (1930) in Tarski (1956/1983), that in 1921 he established the Deduction Theorem for the formalism of *Principia Mathematica*, and he refers for a published proof to a location in a paper from 1933, translated at Tarski (1956/1983, p. 286). It should be noted that the footnote † occurs only in the English translation of (1930), but not in the original. The (1956/1983) edition(s) ought not to be seen as just (a) translation(s), since a number of its papers have undergone considerable editorial retouching. For serious historical work it is not reliable, and one is well advised always to go back and check the text against the originals.

<sup>12</sup> Indeed, here in the quotation from Proops, the turnstile is used as a predicate for the holding of (syntactic) *consequence* between formulae A and B, which here should be taken as meta-variables for wffs rather than as "sentence letters" in the propositional calculus. Frege's assertion sign ("content stroke plus judgement stroke") is applied to judgeable contents ("propositions") and will yield a judgement; Frege's *symbol*, but not its intended use, was retained by Barkley Rosser, who turned it into a theorem predicate applicable to formal wffs. Kleene then extended this use to allow also side-formulae to the left, whereby it was turned into a predicate expressing syntactic consequence among formal wffs. Kleene (1952a, p. 88, p. 526) gives details and references.

#### **4 Ambiguity in logical notation**

The confusion is made worse owing to the threefold ambiguity involved in our current logical notation within Natural Deduction. A formula may serve in at least three different roles:


Against this background we can understand that confusions may arise when one attempts to slide a modern Natural Deduction grid over the works of Frege. However, owing to the peculiar absence of meaning in Gentzen's formalized object languages, the resulting ambiguities will not be acutely felt there. His languages are not designed for use, but are there to be spoken about. Thus, when it comes to rendering a Natural Deduction derivation into the vernacular, opening a deduction, one customarily says: "Assume A!" (with the intention of deriving B). This really introduces yet a fourth reading of the calculus letters, and one that is mainly concerned with manipulating ("metamathematical expressions", that is, a certain kind of) objects. But how, in the absence of meaning, can one "assume A"? A well-formed formula is a *thing*, a "metamathematical expression" that does not itself express, but gets expressed by means of an expression. Owing to the absence of meaning it is not something that can be assumed (except in this derived, "symbol manipulating" sense). It is an object, an entity, that may be discussed, but it does not say anything. One does not assume, nor assert, objects such as Paris, 14, or the BBC.

The first of the three renderings above can also be ruled out: one does not assume a *proposition* A, but one may, as in (3), assume that *the proposition A is true*. However, one may also assume *that one knows, or is granted, that proposition A is true*. Assumptions of the kind (3) are not used in Frege's works, whereas the rules of inference all use *epistemic assumptions* that *beurtheilbare Inhalte* are known (while prefixed with assertion signs). The modern notion of a predicate-logic proposition is the inheritor of Frege's *beurtheilbarer Inhalt*, and in Fregean inferences they are prefixed by the *Inhaltsstrich-cum-Urtheilsstrich*. The premisses of Fregean inferences are presupposed to be known. In order to validate an inference, rather than establish that consequence, be it logical or not, holds from antecedent proposition(s) to consequent proposition — that is, the Bolzano method — one assumes that one is granted assertions of the premise judgements, and will then have to take responsibility for asserting the conclusion judgement:

When I say "*Therefore*" I give others my permission to assert the conclusion on my authority, provided that they grant me their authority for asserting the premisses.

This view on the validation of inference steps, rather than on the holding of consequence, renders Frege's insistence on inferring from (possibly hypothetically) *known* premisses understandable and right. Criticism of Frege's view of inference commonly takes him to be speaking about the modern notion of *logical consequence* (among propositions), that is, holding under all variations, after the fashion of Bolzano, Wittgenstein's *Tractatus*, and Tarski. That, however, is a notion that, just like logical truth, plays no role in Frege's writings.

Finally, Wolfgang Kienzler (2009, §4.6.3) gave a rich epistemological analysis of the background to Frege's *Grundlagen* (including its §17) in the German translation of Mill's *Logic*, but does not enter into the technicalities of Frege's analyticity claim. Kienzler deserves praise for thus drawing attention to the Schiel translation of Mill. It constitutes a source that has not yet been exhausted by current Frege research. For instance, Mill's discussion of facts (*Thatsachen*) is likely to provide some insight into Frege's use of that term in GLA.

#### **5 "Frege analyticity"**

In order to understand and evaluate GLA, §17, we need to consider what Frege means by *analytic*. In Frege (and in Kant), "analytic" is applied to judgements, and not to propositions, that is, Fregean *beurtheilbare Inhalte*. The *Grundlagen der Arithmetik* is not the only work where Frege makes use of analyticity. Both the earlier *Begriffsschrift*, as well as the later essay *Über Sinn und Bedeutung* from 1892, make explicit and acknowledged use of the *Kantian* notion, rather than his own notion of analyticity from *Grundlagen*.13 The *Jäsche Logik*, to which Frege refers in the *Grundlagen*, is a likely source for his account of the Kantian notion of analyticity. However, only in *Grundlagen* does Frege make public use in print of his own notion of analyticity.

The modern (post-Quinean) discussion of analyticity takes it to be a classification of propositions, that is, of contents. An analytic proposition is there seen as one that is *logically true*, come what may, independently of what is the case, in or under all "variations", or reduces to such a logically true proposition through definitions. Bolzano, Wittgenstein, Quine, and Tarski are well-known adherents of (versions of) this view. That, however, is not how Frege views matters, and Quine, who brought the *logical truth under substitutions of synonyms* concept to the fore, and, even more so, Paul Boghossian, who, as far as I know, very deliberately coined the unfortunate misnomer "*Frege analyticity*", have to carry the responsibility.

Frege, *Grundlagen der Arithmetik*, §3:

1. Jene Untersuchungen von apriori und aposteriori, synthetisch und analytisch betreffen . . . nicht den Inhalt des Urtheils, sondern die Berechtigung zur Urtheilsfällung. . . . Wenn man einen Satz in meinem Sinne a posteriori oder analytisch nennt, so urtheilt man . . . darüber, worauf im tiefsten Grunde die Berechtigung des Fürwahrhaltens beruht.

<sup>13</sup> *Begriffsschrift* (1879, §8, pp. 14–15) and *Über Sinn und Bedeutung* (1892, p. 25, p. 50), but also in the letter to Stumpf (published as being to Marty) from 29 VIII 1882.

(These investigations of apriori and aposteriori, synthetic and analytic, are not concerned with the content of the judgement (i.e. the Thought, the proposition, G. S.), but with the right to judge . . . When one calls a judgement a posteriori or analytic in my sense, one judges . . . about that whereupon, at the deepest level, the justification for holding it true rests.)

Frege deploys his versions of the Kantian dichotomies, analytic/synthetic and a priori/a posteriori, at the level of the *justifications* for the judgements made: they "are not concerned with the content of the judgement". *So much for Boghossian's "Frege analyticity"!*

In the case of mathematics this means that we are now considering the demonstrations that serve to establish their conclusions, that is, the demonstrated theorems. Whether a theorem is analytic or not depends on the kind(s) of demonstration that it has, in particular on the starting points of the reasoning, that is, its *Urwahrheiten*.

2. Dadurch wird die Frage dem Gebiete . . . der Mathematik zugewiesen, wenn es sich um eine mathematische Wahrheit handelt. Es kommt nun darauf an, den Beweis zu finden und ihn bis auf Urwahrheiten zurückzuverfolgen.

(Thereby, when one is concerned with a mathematical truth, . . . the question is assigned to the domain of mathematics. The task now is to find the demonstration and trace it back to basic truths.)

The question as to whether a certain "mathematical truth" is analytic is accordingly a *mathematical* one: it is incumbent upon us to find its demonstration. *Urwahrheit* — "basic truth" — is Frege's favoured term for axioms. Frege here appears to presuppose that a "mathematical truth" has a demonstration, and that it can be found. Hence, perhaps somewhat surprisingly, and in contradistinction to latter-day realists, he is committed to the principle that mathematical truths can be known. In familiar current terminology, Fregean "mathematical truths" are demonstrated, or demonstrable, *theorems*.

3. Stösst man auf diesem Wege nur auf die allgemeinen logischen Gesetze und auf Definitionen, so hat man eine analytische Wahrheit, wobei vorausgesetzt wird, dass auch die Sätze mit in Betracht gezogen werden, auf denen etwa die Zulässigkeit einer Definition beruht.

(When along this way one encounters only the general logical laws and definitions, one has an analytical truth, whereby it is presupposed that also those theorems are taken into consideration upon which the admissibility of a definition depends.)

The crucial step in Frege's treatment arrives when we have reached the "leaf" positions in the fully developed justificatory, backwards-search tree. It is strongly reminiscent of Aristotle's treatment at the outset of the *Posterior Analytics*, Book I, Chapter 4: the ultimate principles should be *general*, *topic neutral*, and, in Latin, *per se*, that is, knowable *in and from themselves*: "self-evident". All three are directly

at issue here in Frege's formulations. Just as Aristotle demands that the Principles of Demonstrative knowledge should be *general*, so does Frege. The leaf positions in his trees of justifcation — "grounding trees" — must be occupied by assertions with general propositions as contents. A single particular judgment in leaf position is enough to rule out Fregean analyticity.

#### **6 General logical laws: self-evident and topic-neutral**

In Quotation (3) above Frege draws upon "the general logical laws". What are they? In 1884, Frege's short answer would be: the rules and axioms of my *Begriffsschrift*. In the light of this ready answer, Frege's "*Theory of Consequence*", that is, his treatment of the topics in the *Prior Analytics*, does not match Aristotle's in its answers to questions about consequence. Frege's form of judgement is different from Aristotle's "*A belongs to B*", so their respective answers to the question of consequence, *What follows from what?*, are different. His *Theory of Demonstration*, though, is taken straight out of the *Posterior Analytics*, and his answer to the question *What is a demonstration* ("proof")? is very Aristotelian. The generality constraint on the *Urwahrheiten* of a mathematical demonstration, as well as their topic neutrality and applicability across the board, irrespective of subject matter, are taken straight out of Aristotle's account in the *Posterior Analytics*.

Also the third Aristotelian criterion, namely *perseity*, we may find in Frege. Aristotle has four kinds of perseity, and the first two are of interest to us.14 Perseity of the first kind concerns a "belonging" of the form *A belongs to B*, and holds when the predicate A is part of the "*formula*" (*logos*) of the subject B. The knowledgeable reader will here have recognized a version of Kantian analyticity: the predicate A is contained in the notion of the subject B. Aristotle applies perseity to "belongings", and in the Scholastic version of his account, say in the *Summa Theologiae* of St Thomas Aquinas, Q2, art. 2, where the "belonging" *A belongs to B* becomes the predication *B is A*, there is now talk of a *propositio per se nota*, that is, a judgment known in, or from, itself. Thomas elucidates this by saying that a *propositio per se nota* of this first kind, where "the predicate lies contained in the subject", is also *self-evident*, in the sense of *quae statim, cognitis terminis, cognoscuntur*, that is, *known as soon as their terms are known*.15

The second of the four Aristotelian kinds of *per se* connection between terms is exemplified by "even" and "number": number must occur in the *logos* of a term to which one may apply "even". Today we would say that "x is a number" is a *presupposition* for the *meaningfulness* of "x is even": only if "x is a number" does "x is even" make sense. Aristotle also gives *straight* and *oblique* as in *per se* connection of the second kind to *line*: the meaningfulness of "x is oblique" presupposes that "x is a line". This second kind of *perseity*, I suggest, is the source of Frege's rider on

<sup>14</sup> An. Post, I;4, 73a35 f.

<sup>15</sup> Aquinas ST Q2:art2, SCG Ch. 10.

definitions in Quotation 3 above: "it is presupposed that also those theorems are taken into consideration upon which the admissibility of a definition depends". Without it being known that x is a number, the definition of "x is even" would be inadmissible. The *propositio per se nota* of Thomas Aquinas is matched by Frege with certain items of knowledge that are known from meaning alone, be it a self-evident judgement or knowledge of presuppositions.

In §5 Frege uses a well-known characterization of axioms: *unbeweisbar und unmittelbar klar*, that is, axioms should be *indemonstrable and immediately clear*. The clarity in question is familiar from Descartes' characterization of the notion of *evidence*.16 Evident knowledge has to be "clear and distinct". The immediacy here is not temporal — an axiom, that is, a self-evident judgement, need not be obvious, or at all easy to grasp. It is immediate in that it has no predecessors in the epistemic order. The immediacy terminology stems from Aristotelian syllogistic. An "S is P" judgement is immediate when there is no middle term to be found that could mediate between the major and minor terms in order to syllogize.

We also find several occurrences of another terminological item from traditional epistemology in GLA, §5, namely *einleuchten*, which is notoriously difficult to translate into English.

Und ist es dann *unmittelbar einleuchtend*, dass 135664 + 37863 = 173527 ist?

wie sollen sie anders eingesehen werden als durch einen Beweis, da sie *unmittelbar nicht einleuchten*?

so müsste die Richtigkeit unserer Gleichung *sofort einleuchten*

durch die Anschauung *unmittelbar einleuchten*

(*my emphasis* G. S.)

#### and in §90

weil der Mathematiker zufrieden ist, wenn jeder Übergang zu einem neuen Urtheile als richtig einleuchtet, ohne nach der Natur dieses Einleuchtens zu fragen, ob es logisch oder anschaulich sei.

The novel Ebert-Rossberg translation of GGA uses "obvious" here. I would prefer *evident*, but not *self-evident*, since *einleuchten* is coupled with *unmittelbar* and with *sofort*. Obviousness is always immediate: the obviousness of what is obvious should also be obvious, and so "immediately obvious" has a pleonastic ring to it, whereas "*unmittelbar* self-evident" *is* pleonastic. The evidence (of what is evident), on the other hand, can be immediate, but also *mediate*.

Es handelt sich um mein Grundgesetz (V). Ich habe mir nie verhehlt, dass es nicht so einleuchtend ist, wie die andern, und wie es eigentlich von einem logischen Gesetze verlangt werden muss.17

<sup>16</sup> *Evidence* — Dutch *klaarblijkelijkheid* — in the sense *evidence of* (what is evident) but not *evidence for* (a claim). English is the only language that uses *evidence for*, particularly in legal contexts; nevertheless, also in the OED, *evidence (of)* is given as the frst meaning.

<sup>17</sup> GGA, Vol II, p. 253.

To demand *obviousness* of an axiom is asking too much. It need not be at all easy to grasp the terms out of which an axiom is built, or the way they contribute to rendering self-evident the result of joining them together in that particular way. It, as an axiom, must be *self*-evident, but that evidence need not be obvious, or at all easy to fathom: it can demand considerable experience in working with the concepts out of which an axiom is built before the penny drops and one realizes its self-evidence. As remarked already, the immediacy (*Unmittelbarkeit*) in question is conceptual, but not temporal. *Temporal* immediacy, on the other hand, Frege marks by the use of *sofort*.

Wenn es nicht möglich ist, den Beweis zu führen, ohne Wahrheiten zu benutzen, welche nicht allgemein logischer Natur sind, sondern sich auf ein besonderes Wissensgebiet beziehen, so ist der Satz ein synthetischer.

(When it is not possible to conduct the demonstration without using truths that are not of a general logical nature, but pertain to a particular domain of knowledge, the theorem is a synthetic one.)

Here Frege's Aristotelian background comes to the fore. His *Theory of Consequence* — "what follows from what?" — is not Aristotelian, since the form of judgement in the Fregean ideography — *ist eine Thatsache* — is different from the Aristotelian "*A belongs to B*". The difference between analytic and synthetic judgements is based on what laws are used in leaf positions in the trees of demonstration. An analytic judgement may draw *only* upon general principles that are, in Gilbert Ryle's happy phrase, "topic neutral", that is, applicable within all fields of inquiry, irrespective of subject matter. When the demonstration makes use of a general law of restricted applicability that holds only within a specific area of scientific inquiry, say geometry or biology, the demonstrated theorem is synthetic. Frege's formulations, when explaining his dichotomy of analytic and synthetic, really pertain to demonstrations, rather than to their theorems. Hence it cannot be ruled out that one may find also an analytic demonstration for a judgement that has been established through a synthetic demonstration. The limitations imposed on synthetic judging stem from the Aristotelian need to avoid *metábasis eis állo génos*, that is, transferring principles from one domain of scientific discourse to another, where their applicability is not uncontested, for example, trying to use biological principles for establishing geometrical theorems.

Damit eine Wahrheit aposteriori sei, wird verlangt, dass ihr Beweis nicht ohne Berufung auf Thatsachen auskomme; das heisst, auf unbeweisbare Wahrheiten ohne Allgemeinheit, die Aussagen von bestimmten Gegenständen enthalten.

(For a truth to be a posteriori, it is required that its demonstration cannot be conducted without recourse to facts, that is, indemonstrable truths without generality that contain statements about certain objects.)

A judgement is *a posteriori* if it cannot be established without recourse to individual, specific facts. Both analytic and synthetic judgements have to be demonstrated by logical means alone from *general* laws, and in virtue of this they are *a priori*.

#### **7 Axiom** *eines Beweises weder fähig noch bedürftig*

Ist es dagegen möglich, den Beweis ganz aus allgemeinen Gesetzen zu führen, die selber eines Beweises weder fähig noch bedürftig sind, so ist die Wahrheit apriori.

(If, to the contrary, it is possible to conduct the demonstration entirely from general laws that neither admit nor need demonstration, the truth is a priori.)

We should note here that the arresting phrase "*eines Beweises weder fähig noch bedürftig*" is not original with Frege, but places him squarely in the rationalist epistemological tradition. Scholarly opinion varies as to where Frege took it from. Thus Gottfried Gabriel holds that the "formulation is an acknowledged quotation from Lotze":

Sätze . . . deren Gültigkeit für uns unmittelbar feststeht, die daher eines Beweises weder bedürftig noch fähig sind.18

I have not been able to find such an acknowledgement in Frege.

However, besides Lotze's *Logik*, there are several more places known to Frege from where he could have taken the formulation. Thus, according to Tyler Burge, Leibniz in the *Nouveaux essais*, Book IV, Ch. 9, §3 is a likely source:

The assumption that axioms are basic laws or basic truths can be explicated in terms of Frege's characterization of primitive general laws as being "*neither capable nor in need of proof* " (FA §3). This phrase comes directly from Leibniz, from whom Frege probably got it (Leibniz [Nov. Es.] IV,ix,2).19

(*my emphasis*, here and in the other quotations. G. S.)

Leibniz's French reads very much the same as its English translation:

une évidence entière qui *n'est point capable d'être prouvée et n'en a point besoin*. (Erdmann, 373)

This is a comment on Locke's *An Essay Concerning Human Understanding*, Book IV, chapter xvii, §14, where we read:

And this, therefore, as has been said, I call *Intuitive Knowledge*; which is certain, beyond all Doubt, and *needs no Probation, nor can have any*; this being the highest of all Humane certainty.

However, Frege's main philosophical source for his *Grundlagen*, namely Baumann's *Die Lehren von Raum, Zeit und Mathematik in der neueren Philosophie*, contains the above passage from Locke in German translation:

So nimmt der Geist wahr, dass ein Bogen des Kreises kleiner ist als der ganze Kreis, ebenso klar, wie er die Idee des Kreises wahrnimmt; und dies demnach nenne ich anschauende Erkenntnis, welche gewiss ist, über alle Zweifel erhaben, *keines Beweises bedarf und keinen haben kann*; dies ist die höchste aller menschlichen Gewissheit.20

<sup>18</sup> Gabriel (2002, p. 47) refers to Lotze (1874, §200). I have not been able to find any direct quotation of Lotze's §200 in GLA. Gabriel and Schlotter (2017, Ch. 4, §3) offer more details as to Frege and Lotze on the status of logical laws.

<sup>19</sup> Burge (1998, p. 313). My *emphasis*, here and in the other quotations.

<sup>20</sup> Baumann (1868, Band I, p. 362).

Apart from the one quoted by Burge, other Leibniz passages (from the same source) are given by Baumann:

Die unmittelbare Apperception unseres Daseins und unserer Gedanken liefert uns die ersten Wahrheiten a posteriori oder thatsächlicher Art, d. h. die ersten Erfahrungen. Sie sind *unfähig bewiesen zu werden* und können unmittelbar genannt werden, — weil Unmittelbarkeit zwischen Verstand und Object bei ihnen stattfindet.21

and Baumann himself also uses the formulation freely:

Dies Prinzip, das ich soll erschlichen haben, ist das vom Bedürfnis eines zureichenden Grundes, damit eine Sache existiert, ein Ereignis eintritt, eine Wahrheit statt hat. Ist das *ein Prinzip, das Beweise bedarf* ?

(Erdmann p. 778, 125.)

and

Der Gegensatz ist einleuchtend: Leibniz stellt das Prinzip [i.e. Satz vom Grunde] in der ihm eigenthümlichen Fassung auf, rundweg auf *als keines Beweises bedürftig* . . .22

Against the background of such plenitude, Baumann seems to me a more likely source than either Lotze or Leibniz.

At the time when Frege was writing the *Begriffsschrift* and his *Grundlagen*, Brentano lectured on *Logik* in Vienna:

Man kann das evidente Urteil als solches bezeichnen, welches in sich als richtig charakterisiert ist. Solche Urteile sind *keines Beweises fähig und keines Beweises bedürftig*.23

This Frege will not have known, but it shows how widespread formulations in terms of *fähig* and *bedürftig* had already become by 1880.

To my mind, none of the above, however, was Frege's source for his formulation:

Alle Gewissheit ist entweder eine unvermittelte oder eine vermittelte, d. h. sie bedarf eines Beweises, oder ist *keines Beweises fähig oder bedürftig*.

Apodicticity is either immediate or mediate, that is, it stands in need of a demonstration, or it is neither capable nor in need of demonstration.

In this quotation we have all three components *Beweis*, *fähig* and *bedürftig* in close juxtaposition, which are not found in the quotations from Baumann.

The author is Kant, and the place is the *Jäsche Logik*, Ch. IX *Logische Vollkommenheit der Erkenntnis*, A108–109, a work we know Frege read and quoted at the time of writing the *Grundlagen*. The English translators (1988, p. 79) give:

"All certainty is either *mediated* or not *mediated*, that is, it either requires proof or is neither susceptible nor in need of any proof."

<sup>21</sup> Baumann (1868, Band II, p. 225).

<sup>22</sup> Baumann (1868, Band II, p. 292, and 293, respectively).

<sup>23</sup> Brentano (1956, §29, art 85, g, p. 111). The fragment in question stems from a set of notes for logic lectures delivered in Vienna in the second half of the 1880's.

I definitely prefer *immediate or mediate* over their choice, as well as the term *demonstration* (which is cognate to *Beweis*) rather than "proof", which is related to *tests* (German *prüfen*, *Prüfung*) rather than to *Beweise*, and is cognate with *probe* and *approve*. *Gewissheit* is notoriously difficult to translate into English. The fine verb *to wit* (which is cognate to *wissen*) has been jettisoned, which leaves us a term short for convenient translation of *scire* (French *savoir*, German *wissen*), since *know* obviously has to translate *cognoscere* (French *connaître*, German *kennen*). Upon reflection, the best I can do is to translate *gewiss* with *apodictic*, which yields the apposite *apodicticity* for *Gewissheit*, and is in harmony with Frege's adoption of Kantian terminology, at BS, §4, concerning *apodictic* and *assertoric* judgements.

Kant returns to the formulation in §33 as well:

Demonstrable Sätze sind die, welche *eines Beweises fähig* sind; die keines Beweises fähig sind, werden indemonstrable genannt.

Demonstrable statements are those which are capable of demonstration; those not capable of demonstration are called indemonstrable.

That starting points of demonstrations are not capable of proof is clear; they are indemonstrable. But on the other hand, nor are they in need of demonstration, since they are self-evident; as soon as their terms are known, according to the Thomistic formula quoted above, they can be known as such.

#### **8 The** *Begriffsschrift*

With the benefit of hindsight one may find the genesis of Frege's approach to analyticity in the *Preface* to the *Begriffsschrift*. Already here Frege runs the Aristotelian regress of asking for justifications, and the grounding of his judgements eventually brings him back to first principles, in much the same way that we came across previously in GLA, §3, above. Frege is concerned to place demonstrative "scientific" truths — truths of *reason* — after a Leibnizian fashion, at one side of a *dichotomy*, with truths of *fact* at the other side. When the Leibnizian resolution, of replacing expressions by their definitions, comes to an end, it yields an a priori demonstration of its conclusion; and the leaf positions at which the resolution stops are "*judgements of identity*": an A is A, or an AB is A.24 In this case the truth is a priori and necessary. In the case of an empirical, contingent "A is B" judgement the *resolution* — Latin for Greek *analysis* — will not come to an end, because if it did we would read the resolution tree as a demonstration, in the opposite direction, from top to bottom, whence the judgement would be rendered necessary. The terms of a contingent judgement are fully available only to God (by direct acts of grasping). God does not come to know discursively through analysis, but we humans will have to rely on empirical research, such as acts of perception, in order to come to know the judgement in question. The analysis does not work here, since owing to infinite complexity, it

<sup>24</sup> The fragment *Primae veritates*, Eng. tr. "First Truths", in: Leibniz (1973, pp. 87–92), gives a clear exposition of Leibnizian resolution.

will be non-terminating, and does not reach primitive terms and the concomitant judgements of identity. However, even though Frege does draw attention to analyticity, in §8, and in §24, there it is not his own later notion that is brought into play but the Kantian one.

Frege, Preface to *Begriffsschrift*:

Das Erkennen einer wissenschaftlichen Wahrheit durchläuft in der Regel mehrere Stufen der Sicherheit.

(The *recognition* of a *scientific* truth, as a rule, runs through several levels of *certainty*.)

Note that Frege uses *Sicherheit*, and not *Gewissheit*. Here it is the degree of conviction that is at issue, and not absolute apodicticity; it may be different, more or less strong, at different stages in scientific inquiry. The desired apodicticity is reached only after a demonstration has been given:

Das apodiktische Urtheil unterscheidet sich vom assertorischen dadurch, dass das Bestehen allgemeiner Urtheile angedeutet wird, aus denen der Satz geschlossen werden kann, während bei dem assertorischen eine solche Andeutung fehlt.25

(The apodictic judgement differs from the assertoric judgement in that the existence of general judgements is indicated, from which the sentence may be inferred, while for the assertoric there is no such indication.)

This is very close to what becomes a priori judgements in the *Grundlagen*. The tree of justifications ends in general laws, but they are not required to be topic-neutrally applicable across the board, as will be the case in GLA for analytic judgements.

Die festeste Beweisführung ist offenbar die rein logische, welche, von der besonderen Beschaffenheit der Dinge absehend, sich allein auf Gesetze gründet, auf denen alle Erkenntnis beruht.

(The firmest mode of *demonstration* is patently the purely logical one, which, disregarding the particular nature of objects, grounds itself only upon those *laws* on which all *knowledge* depends.)

How, we may wonder, does Frege know that there is a *firmest* mode? Perhaps all modes of demonstration are equally firm, or given a mode of demonstration there is always a firmer one to be found? A formulation along the lines *There is no firmer mode . . .* might seem more prudent. Furthermore, if firmness be related to degree of conviction, Frege's claim is highly doubtful, at least to me. A trivial computation, as taught in first grade of elementary school, establishes 25 + 42 = 67, and it will produce considerably more conviction than a huge derivation in predicate logic involving, at least, 67 variables of quantification; with numbers above 1000 the computations are still perfectly feasible and can be kept under full control, whereas the corresponding derivations will sorely tax the patience and meticulousness of even the most diligent computist. Crucially important here is Frege's insistence on the topic-neutral applicability of logical laws.

Already at the very outset of his *Begriffsschrift* Frege isolated the special character of knowledge that is obtained from purely logical, general and topic-neutral laws.

<sup>25</sup> BS, §4.

Wir theilen danach alle Wahrheiten, die einer Begründung bedürfen, in zwei Arten, indem der Beweis bei den einen rein logisch vorgehen kann, bei den anderen sich auf Erfahrungsthatsachen stützen muss.

(We then divide all truths that demand a justification into two kinds such that for one the demonstration can proceed purely logically, while for the other it has to rest upon facts of experience.)

This is the Leibnizian dichotomy of (necessary) *truths of reason* versus (contingent) *truths of fact*. We should, however, note that not every truth demands a justification. Some truths, it would appear, are *per se nota* and can be known of, or from, themselves. For truths that demand justification, on the other hand, the dichotomy between *purely logical* versus *relying on experiential facts* applies, but in every case it is possible to (come to) know the truth as such. *Fregean truths are knowable!*

However, even though Frege does draw attention to the analytic/synthetic distinction also in *Begriffsschrift*, §8, there it is not his own later notion that is brought into play, but the Kantian one. To a name there is associated a *Bestimmungsweise* — "mode of determination" — of a content.26 Two different names A and B may be associated with different modes of determination that yield one and the same content. Then ⊢ A ≡ B, that is, the judgement of *Inhaltsgleichheit* between A and B, is a *synthetic* one in the sense of Kant. This corresponds to reading the *Inhaltsgleichheit* sign "≡" as expressing that the associated *Bestimmungsweisen* determine identical contents. (In later modern terminology that is inspired by SuB, the names A and B are *co-referential*.) To this co-determinability of the associated *Bestimmungsweisen*, in §24, Frege adds definitional equality as a second reading of the sign "≡".

Frege returns to his example at the opening of SuB:

a = a und a = b sind offenbar Sätze von verschiedenem Erkenntniswert: a = a gilt *a priori* und ist nach Kant analytisch zu nennen, während Sätze von der Form a = b oft sehr wertvolle Erweiterungen unserer Erkenntnis enthalten und *a priori* nicht immer zu begründen sind.

(a = a and a = b are patently sentences of different worth for the cognition: a = a holds *a priori* and according to Kant is to be labelled analytic, while sentences of the form a = b often contain very valuable amplifications of our knowledge and are not always to be justified *a priori*.)27

The Leibnizian organization of the material in *Begriffsschrift* gives way to the

<sup>26</sup> Subsequently, in FuB and SuB, the mode of determination/content dichotomy, with respect to names, is replaced by the distinction between the *Sinn* and the *Bedeutung* of a name. The abstract discussion here in BS, §8, is there sharpened by consideration of the trivial arithmetical theorem 2⁴ = 4² and the famous example concerning the planet Venus as Morning Star and Evening Star. The *Bestimmungsweise* (of a content) in BS is turned into the *Art des Gegebenseins* (of a *Bedeutung*) in SuB.

<sup>27</sup> Apart from the notorious difficulty of how best to render *Bedeutung* into English, the opening passage of SuB poses tricky challenges for the translator. I prefer *worth* over *value* in the translation of *Erkenntniswert*: the "cognitive value" that is preferred in the usual translations, to my mind, has a contemporary philosophical technical ring to it that is absent in the German. Furthermore, the choice to use *extension* for Frege's *Erweiterung* obscures the obvious Kantian context of the example. Here the proper English translation of *Erweiterung* is not *extension*, but *amplification*, being cognate with *ampliative*, as used in the standard translation of Kant's *Erweiterungsurteil*.

Kantian framework of *Grundlagen*. In Frege's letter to Marty(?) of August 29, 1882, the transition seems to have been effected.

Ich habe jetzt ein Buch nahezu vollendet, in welchem ich den Begriff der Anzahl behandele und nachweise, dass die ersten Sätze über das Zählen der Zahl, die man bisher als unbeweisbare Axiome anzusehen geneigt war, sich nur mittels der logischen Gesetze aus Definitionen beweisen lassen, sodass sie im Kantischen Sinne wohl als analytische Urteile zu betrachten sind.

(I have now nearly completed a book in which I treat the concept of Number and demonstrate that the first theses about the counting of number, which one has hitherto tended to regard as indemonstrable axioms, can be demonstrated from definitions by means of the logical laws alone, so that they may well be regarded as analytic judgements in the Kantian sense.)

Frege's gloss on analyticity — "be demonstrated from definitions by means of the logical laws alone" — reads as a summary of his GLA, §3, account, and his choice of a Kantian stance is made a few sentences further down in the same letter:

Denn, während Leibniz [die Kraft des diskursiven Denkens] wohl überschätzt hat, indem er alles aus Begriffen beweisen möchte, scheint mir Kant umgekehrt die Bedeutung der analytischen Urteile zu gering zu achten, indem er sich an zu einfache Beispiele hält. Ich sehe ein grosses Verdienst Kants darin, dass er die Sätze der Geometrie als synthetische Urteile erkannt hat, aber ich kann ihm für die Arithmetik nicht folgen.

(For while Leibniz may well have overestimated [the power of discursive thought] when he wished to demonstrate everything from concepts, Kant on the contrary seems to me to place too low an estimate on the significance of analytic judgements because he sticks to examples that are too simple. I consider it as greatly to Kant's credit that he recognized the theorems of geometry as synthetic judgements, but I cannot follow him regarding arithmetic.)

Frege's assertion that the laws of Number as he describes them are analytic *in the sense of Kant* is hard to understand. What Frege described is much closer to his *own* conception of analyticity. Furthermore, a Kantian analytic judgement is of "A is B" form, where the predicate B is "contained in" the notion of the subject A, whereas Frege's arithmetical laws are not readily cast in such a form. On the other hand, in GLA, §3, p. 3, footnote \*, Frege claims that he does not wish to introduce a novel sense here, but only wants to hit what earlier writers, in particular Kant, have meant. Be that as it may: Frege's §3 account of analyticity appears to be substantially different from Kant's.

There also remains to be answered the vexing question as to why Frege, after having gone to some length in order to set up his own novel version of analyticity, never uses it again after *Grundlagen*.28 Today it is a commonplace that Frege's "Logicism" began with the *Begriffsschrift*. The now-traditional formulation of Logicism was codified by C. G. Hempel in (1945, §10):

[T]he *thesis of logicism concerning the nature of mathematics*:

Mathematics is a branch of logic. It can be derived from logic in the following sense:

a. All the concepts of mathematics, i.e. of arithmetic, algebra and analysis, can be defined in terms of four concepts of pure logic.

<sup>28</sup> In spite of Frege's disingenuous declaration, in the footnote to GLA, §3, that he did not wish to endow the Kantian terms with novel sense, surely he must have been aware that that is precisely what he had done.

b. All the theorems of mathematics can be deduced from those definitions by means of the principles of logic (including the axioms of infinity and choice).

Mathematico-philosophical staple diet though it may be, and as admirably lucid as it is, it does not match Fregean reality. Hempel's folklore picture is painted with the benefit of hindsight, after Frege's informal, non-technical presentation was offered in GLA, and draws extensively on the elaboration provided by Russell in his (post *Principia Mathematica*) *Introduction to Mathematical Philosophy*.

In *Begriffsschrift* Frege's Logicism had not yet been formulated. In the *Grundlagen* it takes the form of showing that the arithmetical theorems are analytic in the sense he has so painstakingly spelled out in §3. This is primarily an epistemological investigation, since analyticity is, after all, an epistemic notion, but the attempted execution of that programme also had ontological consequences.29 The aim of the *Begriffsschrift* was epistemological. Frege wanted to have a means of ensuring that chains of inference were *lückenlos*, that is, "gap-free", and he hoped to find one by using a formal language, with the allowable forms of inference clearly delineated and circumscribed.30 Codifying logic in such a fashion, one could be certain that no extraneous assumptions, or modes of inference, were illegitimately deployed in the course of demonstrations. Perhaps the BS system does not comprise all valid inferences, nor all self-evident axioms, but Frege, from his point of view, could claim

<sup>29</sup> The, admittedly very neat, customary modern arrangement of the concepts *analytic*, *a priori*, and *necessary* as being, respectively, *semantic*, *epistemic*, and *metaphysical* in character, is post-WW II. The first two labellings gained prominence with Quine (1953), and its final codification is due to Saul Kripke in *Naming and Necessity* from 1970.

<sup>30</sup> The term *lückenlos* has recently had a good innings, as "gap-free", in modern Philosophy of Mathematics, but then applied to *demonstrations*. Frege used it for *chains of inferences*, and his use was proper, since a chain of inferences remains a chain of inferences, even though it may contain a gap. The modern use is incoherent: *ein lückenhafter Beweis*, that is, a *gappy demonstration*, is a contradiction in terms. *Lückenhaft* applied to demonstrations is not a qualifying term, but a *modifying* one: a demonstration *with gaps* is no demonstration. Similarly, *invalid* is such a modifying term with respect to demonstrations. Such modifying terms come in pairs comprising a modifying term and a restorative one, for instance *false* and *true*, say, when applied to friends. "His behaviour made me think he was a false friend, but then I learned the full background and understood that he was a true friend after all." Similarly *invalid* (*incorrect*), and *valid* (*correct*), when applied to demonstrations. The dichotomy correct/incorrect is not a proper partition of demonstrations. We give a demonstration and that is enough; we do not then have to demonstrate as a further theorem that the demonstration we have given is correct (valid). If and when a challenge is made towards the demonstration in question, we may legitimately use the term *incorrect* (or *invalid*) when formulating our retraction, or, as the case may be, use the term *valid* when stating that the criticism of the demonstration was ill-founded, because it is a valid demonstration after all. *Validity applied to inferences* and *validity applied to demonstrations* are different notions.
The term *lückenlos* is the *restitutive* term that matches the modifying term *lückenhaft*.

A more elaborate exposition of these matters relating to different meanings of *validity* can be found in my (2019).

this much: if something has been derived by (mechanically) following the rules of the *Begriffsschrift* system, the theorem thus derived is correct.31

With respect to the question as to why Frege did not use his own notion of analyticity, it is easy to answer for BS. In 1879 Frege had not yet formulated his own version of analyticity, so it obviously was not available for him to use in the *Begriffsschrift*. We saw it *in nuce* at work in the letter to Marty(?) from 1882. Accordingly, between 1879 and 1882, Frege formulated his own version of analyticity. His use of the Kantian notion in BS is not without its own difficulties. Frege's form of judgement rules out the use of the Kantian "A is B" containment explanation, and the other characterizations — being elucidatory only, but not ampliative, and resting solely on the Law of (Non-)Contradiction — are not explicitly spelled out by Frege in *Begriffsschrift*. The further characterization used by Locke, Leibniz and St Thomas Aquinas as *trifling* judgements (Locke), *propositions frivoles* (Leibniz) and *nugatoriae* (Thomas) is not discussed by Frege, and with good reason. Frege most definitely does not want those characteristics, since his analytic judgements are meant to be ampliative. During the writing of SuB, which was published in 1892, Frege nevertheless uses the Kantian version of analyticity, and not his own. That is because of a major difference between him and Kant regarding analyticity. Thus, in Frege's extensive list of contents, GLA §99 carries the description *Kants Unterschätzung der analytischen Urtheile*, that is, "Kant's disparagement of analytic judgements". According to Frege analytic judgements are not void of epistemic worth; they are not *folgeleer*. True analytic identities, where the predicate is contained *impliciter* in the subject, may well be ampliative, contrary to the Kantian characteristics of analytic judgements.32 In this context we should also take note of the third of Frege's Habilitation Theses from 16.5.1874:

III. Zahl ist nicht ein ursprünglich Gegebenes, sondern läßt sich definieren.33 (Number is not a primitive given, but can be defined.)

#### **9 The general laws of logic in** *Begriffsschrift* **and the transformation in** *Grundlagen***, §17**

The BS laws of logic that according to Frege are meant to yield the desired apodicticity with respect to the theorems derived are the universal closures, with respect to quantification of what today is called first and second order, of the following *Urwahrheiten*.34

<sup>31</sup> "Begriffsschrift" is an overburdened term in Frege research. I use *Begriffsschrift* for Frege's booklet. Following Jonathan Barnes (2022) I use *ideography* for its language, and *BS system* for the deductive apparatus in which the ideography is deployed.

<sup>32</sup> The *Jäsche Logik*, §37 *Tautologische Sätze* reads like a preliminary study for the opening paragraph of Frege's article SuB.

<sup>33</sup> Kreiser (2001, p. 123).

<sup>34</sup> In BS Frege has only one kind of quantification, which is deployed indiscriminately with respect to all kinds of domains.

Frege's Anticipation of the Deduction Theorem 75

$$\begin{array}{ll}
\text{L1:}\ A \supset (B \supset A) & \text{L6:}\ A \supset \neg\neg A\\
\text{L2:}\ (C \supset (B \supset A)) \supset ((C \supset B) \supset (C \supset A)) & \text{L7:}\ (c \equiv d) \supset (F(c) \supset F(d))\\
\text{L3:}\ (D \supset (B \supset A)) \supset (B \supset (D \supset A)) & \text{L8:}\ c \equiv c\\
\text{L4:}\ (B \supset A) \supset (\neg A \supset \neg B) & \text{L9:}\ \forall a\, F(a) \supset F(c), \text{ for any } c\\
\text{L5:}\ \neg\neg A \supset A &
\end{array}$$

The permitted rules of inference are

(MP) Modus Ponens,

(Gen) Generalization.

Frege, in *Begriffsschrift*, also uses a second version (Gen\*) of Generalization that allows for the inference

$$\frac{\vdash C \supset A(x)}{\vdash C \supset \forall x\, A(x)}$$

where the free variable x does not occur in C.

This rule, however, is readily derivable by adding an additional axiom, to wit,

$$(\$)\qquad \vdash \forall x\,[C \supset A(x)] \supset [C \supset \forall x\, A(x)]$$

by using (Gen) on the premiss of (Gen\*) to get

$$\vdash \forall x [C \supset A(x)]$$

and from this and (\$), MP yields

$$\vdash C \supset \forall x A(x).$$

A demonstration in the system will now start off with, in leaf positions, suitable closures of some of the nine axioms — ten if we opt to include the axiom (\$) — and then Modus Ponens and axiom L9 will give us the desired substitution instances of the axioms.35
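For instance (a reconstruction in modern linear notation, not Frege's two-dimensional ideography; the choice of F(a) is mine), an instance of L1 at a particular term c is obtained from the universal closure of L1 together with L9 and Modus Ponens:

$$\begin{array}{lll}
1. & \vdash \forall a\,(a \supset (b \supset a)) & \text{closure of L1}\\
2. & \vdash \forall a\,(a \supset (b \supset a)) \supset (c \supset (b \supset c)) & \text{L9, with } F(a) := a \supset (b \supset a)\\
3. & \vdash c \supset (b \supset c) & \text{MP, 1, 2}
\end{array}$$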

So we are presented with an assertion of an empirical fact E, to which will be attached a chain of logical inferences. For Frege, as we saw in the previous section, the general logical inferences are those that can be carried out for his ideography, while using the formal apparatus displayed in the *Begriffsschrift*, without drawing either on facts or on scientific general laws that pertain to a special science such as arithmetic, biology, or astronomy.

So we are given a "demonstration tree" for our theorem, where in top "leaf" position we may find either instances of the general logical laws or the empirical fact E.

<sup>35</sup> Frege, of course, uses a rule of substitution in BS without formulating it explicitly. It is a highly non-trivial task to give exact substitution rules, and today formulations in terms of axioms plus a substitution rule have given way to formulations in terms of axiom-schemata. Church (1956b, pp. 289–290) carefully reports the vicissitudes of the substitution rule.


We then continue downwards from these "leaves" using

(MP) Modus Ponens,

(Gen) Generalization.

We now eliminate *Einzelthatsachen* from the demonstration-tree by replacing each formula D that occurs in it by the E-transformed formula E ⊃ D.

Thus, in general, the E-transformed tree will no longer be a BS derivation. Our task, accordingly, is now to restore BS deducibility at each place.

We begin with the leaf formulae in the tree. There are two possibilities: we may have the empirical fact E, or we may have an instance of a general, topic-neutral logical law, that is, a BS *axiom*.

*Case i:* The formula E has been replaced by E ⊃ E, which can be derived as follows:
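The displayed derivation is not reproduced in this text; a standard reconstruction of ⊢ E ⊃ E from L1 and L2 by Modus Ponens (in modern linear notation, not Frege's two-dimensional ideography) would run:

$$\begin{array}{lll}
1. & \vdash E \supset ((E \supset E) \supset E) & \text{L1}\\
2. & \vdash [E \supset ((E \supset E) \supset E)] \supset [(E \supset (E \supset E)) \supset (E \supset E)] & \text{L2}\\
3. & \vdash (E \supset (E \supset E)) \supset (E \supset E) & \text{MP, 1, 2}\\
4. & \vdash E \supset (E \supset E) & \text{L1}\\
5. & \vdash E \supset E & \text{MP, 4, 3}
\end{array}$$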


Accordingly ⊢ E ⊃ E has been demonstrated from topic-neutral logical laws and is thus an analytic theorem ("truth" in Frege's terminology).36

*Case ii:* When A is one of the logical laws L1–L9, or (\$), its E-transform ⊢ E ⊃ A is BS demonstrable as follows:

1. ⊢ A (a BS axiom)
2. ⊢ A ⊃ (E ⊃ A) (L1)
3. ⊢ E ⊃ A (by MP on 1 and 2)

Thus the two leaf-positions in the demonstration are general, topic-neutral logical laws, and the judgement ⊢ E ⊃ A is analytic in Frege's sense.

In order to cope with inferences further down the "demonstration tree" we consider an instance of *modus ponens*:

$$\frac{\vdash (A \supset B) \qquad \vdash A}{\vdash B}$$

and assume that we have demonstrations for the two transformed premises:

$$\vdash E \supset A$$

<sup>36</sup> Frege's demonstration of this is found as Theorem 27 on p. 43 of the *Begriffsschrift*.


and

$$\vdash E \supset (A \supset B).$$

From these we must give a demonstration of ⊢ E ⊃ B using only topic-neutral laws and inferences.

However,

$$\vdash (E \supset (A \supset B)) \supset ((E \supset A) \supset (E \supset B))$$

is an instance of BS axiom L2, whence by Modus Ponens,

$$\vdash (E \supset A) \supset (E \supset B),$$

and one more application of Modus Ponens yields the desired theorem

$$\vdash E \supset B.$$

Finally we address the inference rule of Generalization. We are given a demonstration of a theorem

⊢ E ⊃ A(x), where x does not occur in E.

By Generalization

$$\vdash \forall x (E \supset A(x))$$

but

$$\vdash \forall x\,(E \supset A(x)) \supset (E \supset \forall x\, A(x))$$

is an instance of axiom (\$) and by a further Modus Ponens

$$\vdash E \supset \forall x A(x)$$

we are done.

My discerning readers will, no doubt, already have recognized the inspiration provided by Paul Bernays' elegant (1934a) demonstration of the Deduction Theorem: Frege's hypothetically asserted fact ⊢ E here takes the place of the *assumption* formula in Bernays' demonstration.

A sequence of applications of this little result will yield the required implicational formula, which will be analytic in Frege's sense if the original train of inferences rests only on everywhere-applicable, topic-neutral laws.
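The three cases together yield an effective procedure for transforming a demonstration that uses the hypothetically asserted fact ⊢ E into a demonstration of the corresponding E-transforms. Below is a minimal Python sketch of that procedure; the tuple encoding of formulas and the justification labels are illustrative assumptions, not anything from the text:

```python
def imp(a, b):
    """Build the formula a ⊃ b as a nested tuple."""
    return ('imp', a, b)

def e_transform(e, lines):
    """lines: list of (formula, justification); a justification is
    'hyp' (the hypothesis e itself), 'axiom', or ('mp', i, j), meaning the
    line follows by modus ponens from line i (the conditional) and line j
    (its antecedent). Returns the E-transform E ⊃ A of every line A."""
    out = []
    for (a, just) in lines:
        if just == 'hyp':            # case i: E itself becomes E ⊃ E
            assert a == e
            out.append(imp(e, e))    # derivable from L1 and L2 by MP
        elif just == 'axiom':        # case ii: A, then A ⊃ (E ⊃ A) [L1], MP
            out.append(imp(e, a))
        else:                        # modus ponens case: L2 plus two MPs
            _, i, j = just
            assert lines[i][0] == imp(lines[j][0], a)
            out.append(imp(e, a))    # E ⊃ B from E ⊃ (A ⊃ B) and E ⊃ A
    return out

# Example: from hypothesis E and axiom E ⊃ B, MP yields B; the transform
# yields the theorems E ⊃ E, E ⊃ (E ⊃ B), and E ⊃ B.
E, B = 'E', 'B'
proof = [(E, 'hyp'), (imp(E, B), 'axiom'), (B, ('mp', 1, 0))]
print(e_transform(E, proof)[-1])  # -> ('imp', 'E', 'B')
```

The function only records which theorem each line becomes; the L1/L2 detours that would justify each step inside the Begriffsschrift are noted in the comments.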

An alternative treatment is needed for the case where one does not wish to avail oneself of the additional axiom (\$), but prefers to remain with Frege's third rule of inference (Gen\*):

$$\frac{\vdash C \supset A(x)}{\vdash C \supset \forall x A(x)}$$

where the free variable x does not occur in C. In order to complete the demonstration we accordingly assume that the judgement

$$(\*)\qquad\qquad\qquad\qquad\vdash E\supset(C\supset A(x))$$

has been demonstrated, and we must demonstrate

$$\vdash E \supset (C \supset \forall x A(x)).$$

The trick needed here is well known from the treatments in Bernays (1934) and in Kleene (1951); we note that

$$(\#) \qquad \vdash (P \supset (C \supset R)) \leftrightarrow ((P \ \& \ C) \supset R)$$

and so one reasons from (∗) to

$$(\ast\ast) \qquad \vdash E \ \& \ C \supset A(x)$$

and from here by (Gen\*) to

$$(\ast\ast\ast) \qquad \vdash E \ \& \ C \supset \forall x A(x).$$

An application of (#) in the opposite direction yields the required conclusion

$$\vdash E \supset (C \supset \forall x A(x)).$$
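The import/export law (#) on which this maneuver turns is a two-valued tautology schema, which a mechanical truth-table check confirms; the script below is merely a verification aid, not part of the Fregean apparatus:

```python
from itertools import product

# Truth-table check of (#): (P > (C > R)) <-> ((P & C) > R).
IMP = lambda p, q: (not p) or q

import_export = all(IMP(p, IMP(c, r)) == IMP(p and c, r)
                    for p, c, r in product([False, True], repeat=3))
print(import_export)  # -> True
```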

Alas, this rather pleasing sleight-of-hand, which I also used in my treatment of the Deduction Theorem in (1983b), is not open to us here, because the conjunction & is not a primitive sign in the ideography of Frege's *Begriffsschrift*, which uses only ¬ and ⊃. But

$$E \ \& \ C \leftrightarrow \neg(\neg E \lor \neg C) \leftrightarrow \neg(E \supset \neg C),$$

and the *Begriffsschrift* is semantically complete in its propositional part, whence the treatment can be carried out in the ideographical system of the *Begriffsschrift*. To do so, however, was not an enticing prospect. Going from (∗) to (∗∗∗) using in (∗∗) not E & C, but an ideographical rendering of ¬(E ⊃ ¬C), as middle formula, using only Frege's axioms and rules, would for the modern reader hide and complicate what is going on rather than illuminate it. Accordingly, I instead opted above for the easy way out, adding the (\$) axiom, thereby obviating awkward derivational work inside the *Begriffsschrift* ideography.37

With this my little exercise in Fregean proof theory and on the philosophy behind it has come to an end. The projected Part 2 of the running commentary, with the title

*An "objective order of demonstrations"?*

and comprising discussion of Frege, Bolzano, Port-Royal, Thomas Aquinas, and Aristotle, shall have to wait for another occasion.

<sup>37</sup> Here I am very much indebted to Dr. Joan Bertran-San Millán, of the Czech Academy of Sciences in Prague, who, with his intimate knowledge of the *Begriffsschrift* and its details, drew my attention to the fact that Frege does point out precisely these derivational steps in §11, page 22. I, of course, now suggest that he did this with the express purpose of justifying (Gen\*) by reducing it to (Gen 1), along the lines that also I used above. He generously shared his meticulous derivations in the *Begriffsschrift*-like {¬, ⊃} fragment of propositional calculus with me, and they are now reproduced in the appendix.

#### **Appendix**

This appendix is devoted to a demonstration that the propositions

$$(C \supset (B \supset A)) \quad \text{and} \quad \neg(C \supset \neg B) \supset A,$$

are equivalent in the *Begriffsschrift*. For ease of exposition, and of typing, I use the {¬, ⊃} fragment of propositional calculus that corresponds to the *Begriffsschrift* system.

The proposition ¬(C ⊃ ¬B) ⊃ A is, of course, C & B ⊃ A in *Begriffsschrift* disguise.
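Before replaying the derivations, the claimed equivalence can be confirmed semantically by brute force; this truth-table check is a sanity aid of my own, not part of the demonstrations themselves:

```python
from itertools import product

# Check that C > (B > A) and (not(C > not B)) > A agree in every valuation.
IMP = lambda p, q: (not p) or q

agree = all(IMP(c, IMP(b, a)) == IMP(not IMP(c, not b), a)
            for a, b, c in product([False, True], repeat=3))
print(agree)  # -> True
```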

These derivations are needed for a full treatment of the second of the two rules for the universal quantifier and were designed by Dr. Joan Bertran-San Millán, of the Czech Academy of Sciences, Prague. I am indebted to him for pointing out Frege's use of the equivalence in BS, §11, and for sharing with me his meticulous *Begriffsschrift*-like derivations in the {¬, ⊃} fragment.

#### **Theorem (i)** ⊢ (D ⊃ (C ⊃ (B ⊃ A))) ⊃ (D ⊃ (C ⊃ (¬A ⊃ ¬B)))

Demonstration:

1. ⊢ (B ⊃ A) ⊃ ((C ⊃ B) ⊃ (C ⊃ A)) (BS Thm. 5)
2. ⊢ [(C ⊃ (B ⊃ A)) ⊃ (C ⊃ (¬A ⊃ ¬B))] ⊃ [(D ⊃ (C ⊃ (B ⊃ A))) ⊃ (D ⊃ (C ⊃ (¬A ⊃ ¬B)))] (Subst. in 1: (C ⊃ (B ⊃ A))/B, (C ⊃ (¬A ⊃ ¬B))/A, D/C)
3. ⊢ (C ⊃ (B ⊃ A)) ⊃ (C ⊃ (¬A ⊃ ¬B)) (BS Thm. 29)
4. ⊢ (D ⊃ (C ⊃ (B ⊃ A))) ⊃ (D ⊃ (C ⊃ (¬A ⊃ ¬B))) (MP on 2 and 3)

#### **Theorem (ii)** ⊢ (¬B ⊃ ¬A) ⊃ (A ⊃ B)

Demonstration:

1. ⊢ (A ⊃ B) ⊃ ((B ⊃ C) ⊃ (A ⊃ C)) (BS Thm 9)
2. ⊢ ((¬B ⊃ ¬A) ⊃ (¬¬A ⊃ B)) ⊃ (((¬¬A ⊃ B) ⊃ (A ⊃ B)) ⊃ ((¬B ⊃ ¬A) ⊃ (A ⊃ B))) (Subst. in 1: (¬B ⊃ ¬A)/A, (¬¬A ⊃ B)/B, (A ⊃ B)/C)
3. ⊢ ((¬B ⊃ A) ⊃ (¬A ⊃ ¬¬B)) ⊃ ((¬B ⊃ A) ⊃ (¬A ⊃ B)) (BS Thm 32)
4. ⊢ ((¬B ⊃ ¬A) ⊃ (¬¬A ⊃ ¬¬B)) ⊃ ((¬B ⊃ ¬A) ⊃ (¬¬A ⊃ B)) (Subst. in 3: ¬A/A)
5. ⊢ (A ⊃ B) ⊃ (¬B ⊃ ¬A) (BS Axiom L4)
6. ⊢ (¬B ⊃ ¬A) ⊃ (¬¬A ⊃ ¬¬B) (Subst. in 5: ¬B/A, ¬A/B)
7. ⊢ (¬B ⊃ ¬A) ⊃ (¬¬A ⊃ B) (MP on 4 and 6)
8. ⊢ ((¬¬A ⊃ B) ⊃ (A ⊃ B)) ⊃ ((¬B ⊃ ¬A) ⊃ (A ⊃ B)) (MP on 2 and 7)
9. ⊢ (A ⊃ B) ⊃ ((B ⊃ C) ⊃ (A ⊃ C)) (BS Thm 9)
10. ⊢ (A ⊃ ¬¬A) ⊃ ((¬¬A ⊃ B) ⊃ (A ⊃ B)) (Subst. in 9: A/A, ¬¬A/B, B/C)
11. ⊢ A ⊃ ¬¬A (BS Axiom L6)
12. ⊢ (¬¬A ⊃ B) ⊃ (A ⊃ B) (MP on 10 and 11)
13. ⊢ (¬B ⊃ ¬A) ⊃ (A ⊃ B) (MP on 8 and 12)
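As a further sanity check of my own (no part of the Begriffsschrift apparatus), the conclusions of Theorems (i) and (ii) can be confirmed to be two-valued tautologies:

```python
from itertools import product

IMP = lambda p, q: (not p) or q

# Theorem (ii): (not B > not A) > (A > B)
thm_ii = all(IMP(IMP(not b, not a), IMP(a, b))
             for a, b in product([False, True], repeat=2))

# Theorem (i): (D > (C > (B > A))) > (D > (C > (not A > not B)))
thm_i = all(IMP(IMP(d, IMP(c, IMP(b, a))),
                IMP(d, IMP(c, IMP(not a, not b))))
            for a, b, c, d in product([False, True], repeat=4))

print(thm_i and thm_ii)  # -> True
```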

#### **Theorem (iii)** ⊢ (D ⊃ (C ⊃ (¬B ⊃ ¬A))) ⊃ (D ⊃ (C ⊃ (A ⊃ B)))

Demonstration: exactly as for Theorem (i), with ¬B ⊃ ¬A in place of B ⊃ A and with Theorem (ii) doing the work of BS Thm. 29; the final step is

4. ⊢ (D ⊃ (C ⊃ (¬B ⊃ ¬A))) ⊃ (D ⊃ (C ⊃ (A ⊃ B))) (MP on 2 and 3)

#### **Theorem (iv)** ⊢ (¬(C ⊃ ¬B) ⊃ A) ⊃ (C ⊃ (B ⊃ A))

Demonstration:

1. ⊢ (D ⊃ (C ⊃ (¬B ⊃ ¬A))) ⊃ (D ⊃ (C ⊃ (A ⊃ B))) (Theorem (iii))
2. ⊢ ((¬(C ⊃ ¬B) ⊃ A) ⊃ (C ⊃ (¬A ⊃ ¬B))) ⊃ ((¬(C ⊃ ¬B) ⊃ A) ⊃ (C ⊃ (B ⊃ A))) (Subst. in 1: (¬(C ⊃ ¬B) ⊃ A)/D, A/B, B/A)
3. ⊢ (D ⊃ (C ⊃ (B ⊃ A))) ⊃ (D ⊃ (B ⊃ (C ⊃ A))) (BS Thm 12)
4. ⊢ ((¬(C ⊃ ¬B) ⊃ A) ⊃ (¬A ⊃ (C ⊃ ¬B))) ⊃ ((¬(C ⊃ ¬B) ⊃ A) ⊃ (C ⊃ (¬A ⊃ ¬B))) (Subst. in 3: (¬(C ⊃ ¬B) ⊃ A)/D, ¬A/C, C/B, ¬B/A)
5. ⊢ (¬B ⊃ A) ⊃ (¬A ⊃ B) (a BS theorem)
6. ⊢ (¬(C ⊃ ¬B) ⊃ A) ⊃ (¬A ⊃ (C ⊃ ¬B)) (Subst. in 5: (C ⊃ ¬B)/B)
7. ⊢ (¬(C ⊃ ¬B) ⊃ A) ⊃ (C ⊃ (¬A ⊃ ¬B)) (MP on 4 and 6)
8. ⊢ (¬(C ⊃ ¬B) ⊃ A) ⊃ (C ⊃ (B ⊃ A)) (MP on 2 and 7)

#### **References**


*Schwerin (GDR), September 10–14, 1984*. Ed. by G. Wechsung. Berlin: Akademie-Verlag, 182–188.


*27–30 March 2019*. Ed. by T. Piecha and P. Schroeder-Heister. University of Tübingen, 237–252. url: http://dx.doi.org/10.15496/publikation-35319.


Tichý, P. (1988a). *The Foundations of Frege's Logic*. Berlin: De Gruyter.


Open Access This chapter is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. If you remix, transform, or build upon this chapter or a part thereof, you must distribute your contributions under the same license as the original.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Frege's Class Theory and the Logic of Sets**

Neil Tennant

**Abstract** We compare Fregean theorizing about sets with the theorizing of an ontologically non-committal, natural-deduction based, inferentialist. The latter uses free Core logic, and confers meanings on logico-mathematical expressions by means of rules for introducing them in conclusions and eliminating them from major premises. Those expressions (such as the set-abstraction operator) that form singular terms have their rules framed so as to deal with *canonical identity statements* as their conclusions or major premises. We extend this treatment to *pasigraphs* as well, in the case of set theory. These are defined expressions (such as 'subset of', or 'power set of') that are treated as basic in the *lingua franca* of informal set theory. Employing pasigraphs in accordance with their own natural-deduction rules enables one to 'atomicize' rigorous mathematical reasoning.

#### **1 Introduction**

Our honoree Peter has had abiding and deep interests both in Frege's work in logic, and in proof-theoretic semantics, a field in which he has played an important founding role. I thought it fitting, then, to combine a bit of both in this paper in his honor.

In his recent study Schroeder-Heister (2016), Peter's abstract reads as follows:

I present three open problems the discussion and solution of which I consider relevant for the further development of proof-theoretic semantics: (1) The nature of hypotheses and the problem of the appropriate format of proofs, (2) the problem of a satisfactory notion of proof-theoretic harmony, and (3) the problem of extending methods of proof-theoretic semantics beyond logic.

This study will address (3), by venturing beyond logic to set theory. In seeking to provide a *natural and free* logic of sets, we shall also have some things to say about

Neil Tennant

© The Author(s) 2024

Department of Philosophy, The Ohio State University, Columbus, OH, United States of America, e-mail: tennant.9@osu.edu

T. Piecha and K. F. Wehmeier (eds.), *Peter Schroeder-Heister on Proof-Theoretic Semantics*, Outstanding Contributions to Logic 29, https://doi.org/10.1007/978-3-031-50981-0\_3

(1) and (2). The first part of our journey will involve revisiting Frege, to examine why such a logic is called for, and how to set it up. We shall find that certain proof-theoretic constraints will make that 'it' — that logic — unique . . . or so the line of exposition and development on offer here should lead one to believe.

The eventual goal, which will be reached by the end of this study, is to show how the logic of sets (which consists entirely of natural-deduction rules of inference) can be made *ontologically non-committal*. Its rules of inference will nevertheless be fully *meaning-conferring*. This observation applies not only to the central primitive notions — the variable-binding term-forming operator {x | . . . x . . .} for set abstraction, and the binary relation ∈ for membership — but also to all those ancillary expressions such as ⊆ ('subset of'), ∪ ('union of'), ℘ ('power set of'), etc. The latter notions — though of course logically definable in terms of the primitive notions — are so familiar and 'practically primitive' in the *lingua franca* of informally rigorous set theory that they call for a more focused rule-theoretic treatment. We shall call them *pasigraphs*, and furnish rules for them. Those rules will be meaning-conferring, but *still* incur no ontological commitments at all.

This means that we can furnish set theorists with a framework of logical rules for set-theoretic notions without committing them to an ontology. We can leave to them the job of specifying which sets exist outright, and which sets exist conditionally on the existence of which other sets.

Our foray in this study into the logic of sets is a protean, if rather ambitious, first step in a more general and unifying study of both *natural deduction and truthmaker semantics for pasigraphs*. The motivating idea is that every pasigraph will have introduction and elimination rules in a system of natural deduction governing one's deductive reasoning to and from sentences with the pasigraph in question suitably dominant.1 In addition, every pasigraph will have (model-relative) verification- and falsification-rules for constructing logical truthmakers and falsitymakers of the kind described in Tennant (2018) and Tennant (2010). Such rules afford the pasigraphs what is essentially a proof-theoretic semantics. In the language of the pasigraphs, the notion of logical consequence will be defined in terms of how verifications for premises may be transformed into verifications for conclusions. The aim will then be to show how proofs in Classical Core Logic ℂ+ afford 'quasi-'effective methods for carrying out such transformations of (model-relative) truthmakers. (The scare-quoted prefix will be able to be dropped in the constructive case where Core Logic ℂ affords all the transformations required.)

<sup>1</sup> 'Suitable dominance' is plain dominance in the case of sentence-forming operators such as connectives and quantifiers. In the case of term-forming operators @, such as the set-abstraction operator {x | Φ(x)}, the natural-deduction rules will govern inferences to and from 'canonical identity statements' of the form t = @x Φ(x). We shall expand on this below.

#### **2 The natural and free logic of sets**

There can be an analytically valid logic of sets, even if sets themselves are not *logical* objects. For the purposes of this study, the words 'set' and 'class' will be treated as synonyms. No von Neumann–Bernays–Gödel distinction will be countenanced, according to which sets are those classes small enough to be able to belong to yet other sets or classes, whereas (proper) classes are too big to do so, even though they exist.

Frege is the natural starting point for our study. His legacy of complete formalization, both of his logical resources and of the proofs he provided for his results, is invaluable when it comes to considering exactly what the *logic* of sets really is. Two other Fregean themes are of great importance here too.


The themes concern two questions: (1) Does every well-formed singular term denote, or may some singular terms fail to denote? (2) May there be *Urelemente*2 (objects that are not sets) alongside sets, or are there only pure sets?

Frege, as is well known, plumped for the first option in each of these cases. And as is also well known, his system suffered the disaster of Russell's paradox. That (in our view) was entirely owing to Frege's answer to question (1) — that every singular term denotes. His answer to question (2) — that we should allow for *Urelemente* — threatens no inconsistency at all, and is well worth implementing in any universally applicable logic of sets that recognizes that some things are *not* sets, and that some sets can have non-sets as members.

It will be argued here that the disaster of Russell's paradox stemmed solely from the misguided choice of a 'logically perfect' language for theorizing about sets, regardless of whether one speaks only of pure sets or allows for the 'impurities' of *Urelemente* in their membership pedigrees.

It will also be argued that the logic of sets that emerges for the revisionary Fregean who adopts a free logic is optimally formulated in terms of introduction and elimination rules (in natural deduction) for the set-abstraction operator

$$\{x \mid \dots x \dots \}\text{.}$$

Such rules will be stated in due course. This pulls one from the set-theoretic frontiers of Zermelo (1908), back to Fregean origins. The usual story about set theory is one

<sup>2</sup> *Urelemente* — if one's theory permits them — are individuals in the domain of discourse that are *not* sets (or classes). Simple examples would be ordinary physical objects, such as Hilbert's beer mugs, chairs and tables; or, in more sophisticated vein, the fundamental particles of subatomic physics. Not all *Urelemente*, however, have to be concrete individuals. They can be abstract, without being sets (or classes). One could, for example, treat the *natural numbers* as *sui generis* mathematical (or *logical*) objects, not to be identified with any 'set-theoretical surrogates' such as the finite von Neumann ordinals. One could then 'build sets' on top of them, as Weyl sought to do.

of the logicist being utterly vanquished, and the transition being made to a purely *mathematical* (synthetic *a priori*, at best) theory of abstract objects known as pure sets, characterized (as had been the natural, rational and real numbers) by an appropriate effectively decidable set of axioms and axiom schemata. The main implication of the investigation that will unfold here is that this 'mathematization' of set theory by Zermelo and his followers can be regarded as overly precipitous. It abandoned too early, and too pessimistically, the logicist's aim of characterizing at least the *logic* of our talk about sets. This logic embodies just the constraints governing or constituting the *concept* of set, rather than the existential or ontological commitments of any particular set theories.

At the very least, Zermelo's set theory makes it impossible to deal with *Urelemente* alongside sets. This is because its Axiom of Extensionality identifies any two things that have no members. The empty set (which Zermeloan set theory says exists) has no members; and no *Urelement* can have any members. Thus every *Urelement* is the empty set. But no *Urelement* is a set. So there are no *Urelemente*. Zermelo can be talking only about (hereditarily) *pure* sets. And it would remain a mystery how his set theory can find application in our talk about 'the real world' of physical objects, which are the paradigm examples of *Urelemente*. Another such example would be the natural numbers taken as objects *sui generis*, as they are in Reverse Mathematics. These too are really *Urelemente*, a subtlety often overlooked.
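The little argument just given can be compressed into a single displayed inference (the formalization is mine, in standard first-order notation):

$$\frac{\forall x\,\forall y\,\big(\forall z\,(z \in x \leftrightarrow z \in y) \to x = y\big) \qquad \neg\exists z\,(z \in \emptyset) \qquad \neg\exists z\,(z \in u)}{u = \emptyset}$$

Since the conclusion holds for any memberless $u$, an *Urelement* $u$ would be identical to the empty set, which is absurd if $u$ is not a set.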

There is a line to be drawn between what is logico-analytically valid in our theorizing about sets in general, and which of them have to be specifically postulated, outright or conditionally, as existing. We shall learn that the natural-deduction theorist who is sympathetic to the pursuit of a logic of sets can make a distinctive contribution by taking a very careful look at what was going on in Frege's first systematic stab at the problem. The *Core Logicist* can sharpen the tools Frege left us in a way that is interestingly and significantly short of total mathematizing surrender to the disaster that was Russell's paradox.

The Core Logicist is the theorist who follows the methodological maxim that rules of inference serving to fix the meanings of primitive logico-mathematical expressions have their natural niche in the constructive and relevant deductive reasoning characterized by Core Logic. Conceptual interconnections articulated by definitional rules of inference are constructive and relevant. The aforementioned 'logic of sets' will be generated by using the rules of (free) Core Logic for the usual logical operators, along with well-chosen rules of natural deduction governing set-abstraction.

As explained in Tennant (2017), Core Logic, in its natural deduction formulation, has all its elimination rules in 'parallelized' form. Moreover, their major premises always *stand proud*, with no non-trivial proof-work above them. This ensures two important features: (i) all core natural deductions are in normal form; and (ii) they are also, in a naturally definable sense, isomorphic to the corresponding sequent-calculus proofs. In Core Logic, sequent proofs use Reflexivity as their only structural rule; and otherwise consist only of applications of Right rules and/or Left rules for the operators involved. So core sequent proofs are both cut-free and thinning-free. Right

rules in sequent calculus correspond to introduction rules in natural deduction; while Left rules correspond to elimination rules.

A simple example of a natural deduction and its corresponding sequent proof will serve to fix these ideas. Note how the step of ∧-Elimination labeled (2) (and with major premise A ∧ B) is in parallelized form, and discharges the conjuncts A and B at their assumption occurrences.

*Natural Deduction:*

$$
\dfrac{\dfrac{\overline{A \wedge B}^{\,(1)} \qquad \dfrac{\neg A \vee \neg B \qquad \dfrac{\overline{\neg A}^{\,(3)} \quad \overline{A}^{\,(2)}}{\perp} \qquad \dfrac{\overline{\neg B}^{\,(3)} \quad \overline{B}^{\,(2)}}{\perp}}{\perp}\;{\scriptstyle(3)}}{\perp}\;{\scriptstyle(2)}}{\neg(A \wedge B)}\;{\scriptstyle(1)}
$$

*Sequent Proof:*

$$
\dfrac{\dfrac{\dfrac{\dfrac{A : A}{\neg A, A : \perp} \qquad \dfrac{B : B}{\neg B, B : \perp}}{\neg A \vee \neg B,\ A,\ B : \perp}}{\neg A \vee \neg B,\ A \wedge B : \perp}}{\neg A \vee \neg B : \neg(A \wedge B)}
$$
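The example sequent can also be confirmed semantically; the following truth-table check is merely an extrinsic verification of validity, not a core proof object:

```python
from itertools import product

# Check the sequent  not A or not B : not(A and B)  over all valuations
# in which the premise holds.
valid = all(not (a and b)                       # conclusion: not(A and B)
            for a, b in product([False, True], repeat=2)
            if (not a) or (not b))              # premise: not A or not B
print(valid)  # -> True
```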

#### **3 A look at some Fregean basics**

Consider this formal sentence in Frege's now archaic notation:

$$\Delta \frown \dot{\varepsilon}\,\Phi(\varepsilon)$$

Today it would be written

$$
\Delta \in \{ x \mid \Phi(x) \}.
$$

For Frege, Δ stood for an individual, and Φ for a first-level concept. Frege stipulated in his *Grundgesetze* that the sentence of his displayed form above was to be co-referential3 with

Φ(Δ).

This would mean, for the modern inferentialist, that Frege would regard as logically or analytically valid the two inference rules4

$$\text{F1} \quad \frac{\Phi(t)}{t \in \{\boldsymbol{x} \mid \Phi(\boldsymbol{x})\}} \qquad \text{and} \qquad \text{F2} \quad \frac{t \in \{\boldsymbol{x} \mid \Phi(\boldsymbol{x})\}}{\Phi(t)}.$$

Here we use t instead of Frege's Δ as a placeholder for singular terms. We shall do this throughout, when couching things in natural-deduction terms.
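Unrestricted use of F1 and F2 is what opens the door to Russell's paradox: instantiating Φ(x) as x ∉ x and t as the term r = {x | x ∉ x} makes r ∈ r interderivable with its own negation. A two-line check (an illustration of mine, not part of the text's apparatus) confirms that no truth value survives this:

```python
# Let p stand for the sentence r ∈ r. Rules F1 and F2, so instantiated,
# force p to be equivalent to not-p; no assignment satisfies that.
satisfiable = any(p == (not p) for p in (False, True))
print(satisfiable)  # -> False
```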

Frege wanted also to have his first-order binary membership relation

⌢

explained 'für alle möglichen Gegenstände als Argumente' ('for all possible objects as arguments'). The explanatory definition he offered was as follows (here, for ⌢):5

<sup>3</sup> The German term was 'gleichbedeutend' (Frege, 1893, §34, at p. 52). All English translations of material quoted from Frege are taken from Frege (2013).

<sup>4</sup> See, for example, the Appendix in Prawitz (1965).

<sup>5</sup> *Ibid.*, p. 53.


(Def.⌢)

We seek to render (Def.⌢) in notation we use today. In preparing to do so, we need to remind ourselves that any singular term of the form

$$\backslash\,\dot{\varepsilon}\,\Phi(\varepsilon)$$

was Frege's version of a definite description ('the x such that Φ(x)'), but with the strange twist — in fulfillment of Frege's strict self-imposed requirement that all well-formed singular terms should denote — that, should there *not* be exactly one Φ, the denotation of the displayed term is the class of all Φs. So, if Φ is an empty concept, then the denotation of the displayed term is the empty class; while if more than one object falls under the concept Φ, then the denotation is the class (or set) that they form.6 In contemporary notation (using iota as the modern definite-description operator) we may render Frege's definition of the displayed term as follows:

$$\backslash\,\dot{\varepsilon}\,\Phi(\varepsilon) =_{df} \begin{cases} \iota x\,\Phi(x) & \text{if } \exists x \forall y\,(x = y \leftrightarrow \Phi(y)); \\ \{x \mid \Phi(x)\} & \text{otherwise.} \end{cases}$$
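Frege's case distinction can be mimicked over a finite domain; the Python encoding below (finite sets standing in for classes, and a function name of my own choosing) is only an illustrative model of the stipulation, not Frege's own apparatus:

```python
def backslash(domain, phi):
    """Return the unique phi-object if exactly one object in `domain`
    falls under `phi`; otherwise return the class of all phi-objects."""
    ext = {x for x in domain if phi(x)}
    return next(iter(ext)) if len(ext) == 1 else frozenset(ext)

dom = {0, 1, 2, 3}
print(backslash(dom, lambda x: x == 2))      # unique case -> 2
print(backslash(dom, lambda x: x % 2 == 0))  # multiple case: class of evens
print(backslash(dom, lambda x: x > 7))       # empty case -> frozenset()
```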

Let us now turn to the task of translating into modern notation Frege's definition

$$(\text{Def.}\frown)\qquad \backslash\,\dot{\alpha}\,\big(\neg\forall G\,(u = \{x \mid G(x)\} \to \neg(G(a) = \alpha))\big) \;=\; a \frown u$$

Remember that this definitional identity is an identity between the truth-value of the left-hand side:

$$\backslash\,\dot{\alpha}\,(\dots a \dots)$$

and the truth-value of the right-hand side:

$$a \frown u.$$

The parenthetically enclosed material on the left-hand side is Frege's way of rendering the following second-order sentence in modern notation:

$$\neg\forall G\,(u = \{x \mid G(x)\} \to \neg(G(a) = \alpha))\,.$$

This is classically equivalent to

$$
\exists G\,\neg(u = \{x \mid G(x)\} \to \neg(G(a) = \alpha)),
$$

<sup>6</sup> See Frege (1893), §11. As Roy Cook puts it in his Appendix in Frege (2013) (at p. A-19),

[. . .] '$\backslash\,\dot{\varepsilon}\,\Phi(\varepsilon)$' denotes the unique object that is mapped to the True by the concept named by 'Φ( )', if there is such, and denotes the object named by '$\dot{\varepsilon}\,\Phi(\varepsilon)$' otherwise.

which in turn is classically equivalent to

$$\exists G\,(u = \{x \mid G(x)\} \land G(a) = \alpha).$$

So (Def.⌢) is asserting an identity between the truth-value of a ⌢ u and the truth-value denoted by the Fregean definite description (here rendered in mixed notation)

$$\backslash\,\dot{\alpha}\,\big(\exists G\,(u = \{x \mid G(x)\} \land G(a) = \alpha)\big).$$

For this to be the True (*das Wahre*), the term has to denote an object — the extension (*Werthverlauf*) of some appropriate concept G. Thus the innocent-looking first-order binary predication

$$a \frown u$$

commits one to the existence of a denotation for the term u. This is of a piece with Frege's insistence that a 'logically perfect' language has all its singular terms denoting.

#### **4 Moving on from Frege**

Suppose we abandon that very imperfect conception of logical perfection, and work with a free logic. Let ∃!t be the familiar abbreviation for ∃x x = t, for any singular term t. Free logic has the Rule of Atomic Denotation for atomic predicates A:7

$$\text{RAD} \quad \frac{A(\dots t \dots)}{\exists !t};$$

and expresses the Reflexivity of Identity by the rule

$$\text{Ref=} \quad \frac{\exists!t}{t = t}\,.$$

Note that these are, respectively, logical *weakenings* of the zero-premise rules

$$\frac{\quad}{\exists!t} \qquad \text{and} \qquad \frac{\quad}{t = t}$$

to which Frege, with his 'logically perfect' language, was already committed. So one could invite Frege to recognize the validity of these weakened rules (along with those about to be stated) even though one is now working in a logically 'imperfect' language. In the transition to free logic, the rule of Substitutivity of Identicals remains unchanged:

<sup>7</sup> See Tennant (1978), Ch. 7 §10 for a detailed treatment of free logic and the rules for set theory that we are presenting again here. The rule RAD captures the Russellian requirement that an atomic proposition is true only if all its singular terms denote. Of course it is required in addition that the denotations stand in the relation represented by the predicate of the atomic proposition concerned. The rule RAD captures just the existential presuppositions concerning the singular terms involved.

$$\text{Sub=} \qquad \frac{\varphi \qquad t = u}{\psi}\,,$$

where $\psi$ results from $\varphi$ by at least one substitution of an occurrence of $t$ for one of $u$, or of an occurrence of $u$ for one of $t$.

The natural-deduction rules about to be stated aim to characterize no more than the *logic* of talk about sets. To this end, one needs to clarify the interrelationships among sethood (i.e., existence of a set), set-abstraction, predication, and membership. This is a theoretical or foundational aim that the natural-deduction theorist can *share* with the Fregean. This study will investigate how the two theorists can pursue that aim, and whether one of them can claim to have achieved it in a more satisfactory fashion. Our answer, of course, will be that the natural-deduction theorist, with her free logic, is the winner in this comparison.

The rules that the two theorists can formulate are for a language with the set-term forming operator {x | . . . x . . .} primitive. Also primitive is the two-place predicate ∈ of membership. In due course another primitive (but one-place) predicate will be added to the language (for 'is a set'). The working assumption will be that the same formal-linguistic resources are available to both theorists (Fregean and natural-deduction), so that the comparison of their approaches will be based on the primitive logical rules that they postulate for the same language. Bear in mind that an axiom is here construed as a zero-premise rule.

#### **4.1 Natural-deduction rules for pure sets**

**Some notational preliminaries.** Where $t$ is a closed term (which of course could be a parameter) and $\Phi$ is a formula with just the variable $x$ free, we denote by $\Phi^x_t$ the result of replacing every free occurrence of $x$ in $\Phi$ with an occurrence of $t$. Where $\Phi$ is a sentence involving at least one occurrence of the parameter $a$, and with none of those occurrences within the scope of a variable-binding operator applied to the variable $x$, we denote by $\Phi^a_x$ the result of replacing every occurrence of $a$ in $\Phi$ with an occurrence of $x$. Note that every such occurrence of $x$ in $\Phi^a_x$ is free.
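The two substitution operations can be rendered concretely; the tuple encoding of formulas below is an assumption of mine for illustration only:

```python
def subst(phi, old, new):
    """Replace every free occurrence of the variable/parameter `old`
    in formula `phi` by `new`. Formulas are nested tuples whose first
    element is a tag; ('all', v, body) and ('abs', v, body) bind v."""
    if phi == old:
        return new
    if isinstance(phi, tuple):
        if phi[0] in ('all', 'abs') and phi[1] == old:
            return phi  # `old` is bound here: no free occurrences below
        return tuple(subst(p, old, new) for p in phi)
    return phi

# Φ = x ∈ a:  Φ with t for x, and Φ with x for the parameter a.
phi = ('mem', 'x', 'a')
print(subst(phi, 'x', 't'))  # -> ('mem', 't', 'a')
print(subst(phi, 'a', 'x'))  # -> ('mem', 'x', 'x')
```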

The rule of introduction (in free logic) for the variable-binding abstraction operator that forms set-terms from predicates is

$$
\{\,\}\text{I} \qquad
\frac{\begin{matrix} \overline{\exists!a}^{\,(i)},\ \overline{\Phi^x_a}^{\,(i)} \\ \vdots \\ a \in t \end{matrix} \qquad\quad \begin{matrix} \\ \vdots \\ \exists!t \end{matrix} \qquad\quad \begin{matrix} \overline{a \in t}^{\,(i)} \\ \vdots \\ \Phi^x_a \end{matrix}}{t = \{x \mid \Phi\}}\;{\scriptstyle(i)}
$$

where $a$ is parametric.8

Note how the canonical conclusion

<sup>8</sup> Note that since ∈ is an *atomic* binary predicate, the assumption $a \in t$ in the rightmost subordinate proof implies $\exists!a$ (by the rule RAD). So it is not necessary to have $\exists!a$ as a further dischargeable assumption in that subordinate proof.

$$t = \{ x \mid \Phi \} $$

of { }I has $t$ on its left-hand side, as a placeholder for *any singular term whatsoever*, including the *parameters* (conventionally $a, b, c, \ldots$) that can be used for reasoning involving existentials and universals. On the right-hand side of the identity is a set-abstraction term, formed by means of a *dominant* occurrence of the variable-binding abstraction operator $\{x \mid \dots x \dots\}$. This operator may be applied to a formula $\Phi$ to form the set-abstraction term $\{x \mid \Phi\}$ if, but only if, the variable $x$ has a free occurrence in $\Phi$.

The elimination rules corresponding to the introduction rule stated above for { } are the following three, each one employing the canonical identity statement

$$t = \{x \mid \Phi\}$$

as its major premise (to the left, immediately above the inference stroke). The minor premises (or subproofs) of the first and third rules correspond, respectively, to the first and third immediate subproofs of the introduction rule. This is a convincing sign that the elimination rules are in harmony with the introduction rule that begets them. Bear in mind that the major premise $t = \{x \mid \Phi\}$ stands proud in any application of an elimination rule for { }.

$$
\{\,\}\text{E}_1 \qquad \frac{t = \{x \mid \Phi\} \qquad \exists!u \qquad \Phi^x_u}{u \in t}
$$

$$
\{\,\}\text{E}_2 \qquad \frac{t = \{x \mid \Phi\}}{\exists!t}
$$

$$
\{\,\}\text{E}_3 \qquad \frac{t = \{x \mid \Phi\} \qquad u \in t \qquad \begin{matrix} \overline{\Phi^x_u}^{\,(i)} \\ \vdots \\ \theta \end{matrix}}{\theta}\;{\scriptstyle(i)}
$$

The rule { }E<sup>1</sup> has an atomic conclusion, so it is not necessary to parallelize it for the purposes of Core Logic. This is because no atomic conclusion can feature as the major premise of any elimination. The rule { }E<sup>2</sup> is a special case of the Rule of Atomic Denotation. The rule { }E<sup>3</sup> needs to be parallelized, in order to avoid having non-trivial proof-work above Φ should it happen to stand as the major premise of an elimination.9

The introduction rule and the elimination rules just stated for { } are, as just intimated, in harmony. Harmony requires that there at least be reduction procedures that will eliminate from proofs 'maximal sentence occurrences' — conclusions of introductions that are also major premises of eliminations. (Whether harmony requires *more* than this is a more complicated issue that we shall not broach here.) What follow now are the three reduction procedures that are required for harmony (one for each

<sup>9</sup> This risk would be incurred if { }E<sup>3</sup> were to be stated in the serial form $\dfrac{t = \{x \mid \Phi\}\qquad v \in t}{\Phi^x_v}$.

elimination rule). We state them in the notation of Tennant (2017)10 rather than in our original format in Tennant (1978). We shall refrain from providing reduction procedures for any other Introduction-Elimination pairs, since (as a referee was kind enough to observe) they 'write themselves'.

[Display: the three reduction procedures. For { }E<sup>1</sup>: the minor proofs Σ1 (of ∃!v) and Σ2 (of Φ with respect to v) are substituted into the first subordinate proof Π1 of the introduction, with its parameter relettered to v, yielding v ∈ t. For { }E<sup>2</sup>: the second subordinate proof Π2, of ∃!t, is simply returned. For { }E<sup>3</sup>: the minor proof of v ∈ t is grafted onto the third subordinate proof Π3 (relettered) to prove Φ with respect to v, which is then grafted onto the elimination's subproof to yield its conclusion.]

<sup>10</sup> Note that in Core Logic the reduction procedures are used only in proving the *admissibility* of 'Cut with potential epistemic gain'. All core proofs are in normal form. Reductions therefore do not eliminate maximal occurrences from core proofs, because there aren't any such occurrences in core proofs. Reductions come into the picture only when core proofs are 'strung together', with the conclusion of one core proof occurring as a premise of another. The applicable reductions, when carried out by the core logician, will then furnish a core proof of some subsequent of the 'target sequent' that the follower of Gentzen would be happy to prove by stringing proofs together and repeatedly applying his structural rule of cut. For the core logician, cut is not a rule *of* or *in* the system. Nor is thinning.

A degenerate application of { }I ensures that everything is the set of its members:

**Theorem 4.1** *If t exists, then t is the set of all things bearing* ∈ *to t.*

$$\textit{Proof}\qquad \frac{\overline{a \in t}^{\,(1)}\qquad \exists! t\qquad \overline{a \in t}^{\,(1)}}{t = \{x \mid x \in t\}}\ (1)\ \{\,\}\mathrm{I}\qquad \square$$

So one needs to bear in mind that, with { }I in its present form, the universe of discourse is presumed to consist only of sets and to have no *Urelemente*. This is why the title of this subsection indicates that our rules are for theorizing about (hereditarily) *pure* sets.11

#### **4.2 Natural-deduction rules for impure sets**

If *Urelemente* are to be countenanced — meaning that one has to allow for the possibility of (hereditarily) *impure* sets — then the second premise (currently ∃!t) of the rule { }I will have to take the form St, to be interpreted as 't is a set'. By the Rule of Atomic Denotation this will also secure ∃!t, since S is an atomic predicate.

The S-modified introduction rule for { } is as follows.

$$\{\,\}\mathrm{I}\qquad \frac{\begin{matrix}\overline{\exists! a}^{\,(i)},\ \overline{\Phi^x_a}^{\,(i)}\\ \vdots\\ a \in t\end{matrix}\qquad St\qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ \Phi^x_a\end{matrix}}{t = \{x \mid \Phi\}}\,(i)$$

As a result of this modification, a corresponding change is needed only to { }E<sup>2</sup> among the elimination rules:

$$\{\,\}\mathrm{E}_2\quad \frac{t = \{x \mid \Phi\}}{St}$$

Note that the S-modified { }E<sup>2</sup> is *not* an instance of the Rule of Atomic Denotation. With S-modification the rule { }E<sup>2</sup> has, as it were, 'come into its own' as an elimination rule making its own distinctive contribution.

The S-modified introduction rule for { } is in harmony with its corresponding elimination rules.

#### **4.3 Single-barreled vs. double-barreled abstraction**

The kind of introduction rule being considered here (with or without S-modification) will be called a *single-barreled* rule, because of the single occurrence of the set-

<sup>11</sup> The introduction and elimination rules just stated were first given in Tennant (1978).

abstraction operator in the rule's conclusion, dominant on the right-hand side. The intended contrast is with a *double-barreled* rule, such as Frege's ill-fated12

$$\text{(Va)} \quad \frac{\forall x(\Phi x \leftrightarrow \Psi x)}{\{x \mid \Phi x\} = \{x \mid \Psi x\}},$$

or even its 'free-logical' modifcation

$$(\text{fVa})\quad\frac{\forall x(\Phi x\leftrightarrow\Psi x)\qquad \exists!\{x \mid \Phi x\}}{\{x \mid \Phi x\}=\{x \mid \Psi x\}}.$$

Such rules venture to 'introduce' (in the conclusion) *two* occurrences of the set-abstraction operator, one on each side of the identity sign. This represents a prima-facie limitation on the intended range of identifications afforded by the rule, since it does not involve the more general placeholder t that would be replaceable in licit applications by names or parameters in addition to set-abstraction terms.

If one is countenancing the possibility of *Urelemente*, and accordingly using the predicate S for '. . . is a set', then Theorem 4.1 will read 'If t is a set, then t is the set of all things bearing ∈ to t.' And in the proof of this one will simply substitute St for ∃!t in the proof of Theorem 4.1. This is worth stating as a separate theorem.

#### **Theorem 4.2 (for the S-modified rule** { }**I)** $\dfrac{St}{t = \{x \mid x \in t\}}$*.*

*Proof*

$$\frac{\overline{a \in t}^{\,(1)}\qquad St\qquad \overline{a \in t}^{\,(1)}}{t = \{x \mid x \in t\}}\ (1)\ \{\,\}\mathrm{I}\text{, $S$-modified form}\qquad \square$$

The introduction and elimination rules are *ontologically neutral* — they characterize only the *logic* of one's talk about set abstraction, membership, and predication, not one's theory about what sets actually exist. Whether one decides to confine one's theorizing to pure sets or, alternatively, to allow for impure sets, *Frege would have conceded the validity of the corresponding natural-deduction rules — especially* the validity of the *elimination* rules for the set-term forming operator.

<sup>12</sup> The rules stated here as Va (on this page) and Vb (on page 102) are respectively the inferential equivalents of Frege's own Va and Vb on p. 69 of the *Grundgesetze*, in §53. From our free-logical perspective it is Va that is disastrous, in permitting easy derivation of Russell's paradox. Ironically, Frege himself, in his *Nachwort*, presented a regimentation of Russell's reasoning in the formalism of the *Grundgesetze*, which ended up laying the blame for Russell's paradox on Vb. (I am grateful here to Peter, for drawing this to my attention.) I rather suspect, though, that if one were to regiment *in natural deduction* Frege's own *Nachwort* reconstruction, within his own formal system, of Russell's reasoning, we would find that Frege was blaming the 'wrong half' of Basic Law V. Vindicating this suspicion, however, is beyond the scope of the present paper.

#### **5 Results provable by the Fregean or by the natural-deduction theorist**

In what follows, the *Theorems* stated (at least, up to and including Theorem 8.3) are results to the effect that such-and-such rules of the natural-deduction theorist allow one to derive so-and-so principle of the Fregean; and the *Lemmas* (at least, up to and including Lemma 8.9) are to the effect that so-and-so principles of the Fregean allow one to derive such-and-such rule of the natural-deduction theorist. The convenient abbreviation

$$R\_1, \ldots, R\_n \implies R$$

will be used to state these results. The rules $R_1, \ldots, R_n$ will be primitive for one of the theorists, and the rule $R$ will be primitive or derivable for the other. Any other rules not mentioned, but which are used in the derivation, will be ones that are primitive for both of them (such as, for example, the Rule of Atomic Denotation). The aim, in the first instance, is to see whether the two theoretical approaches (roughly: Fregean vs. natural-deduction) are essentially equivalent.

The reader should be aware that we shall apply { }E<sup>3</sup> in its serial form (see footnote 9) rather than its parallelized form whenever it is convenient to do so. A good example of this is at the final step of the formal proof that follows, of Theorem 5.1.

**Theorem 5.1** { }E<sup>3</sup> ⇒ F2*.*

$$\textit{Proof}\qquad \frac{\dfrac{\dfrac{t \in \{x \mid \Phi(x)\}}{\exists!\{x \mid \Phi(x)\}}\ \text{RAD}}{\{x \mid \Phi(x)\} = \{x \mid \Phi(x)\}}\ \text{Ref{=}}\qquad t \in \{x \mid \Phi(x)\}}{\Phi(t)}\ \{\,\}\mathrm{E}_3\qquad \square$$

If Frege had been instructed on how to construct proofs using rules of inference in the manner employed here, he would have derived the rule

$$\{\,\}\mathrm{E}_1\quad \frac{t = \{x \mid \Phi\}\qquad \exists! v\qquad \Phi^x_v}{v \in t}$$

in the following even stronger form (by not availing himself of the premise ∃!v).

**Lemma 5.2** F1 ⇒ { }E<sup>1</sup>*.*

*Proof*

$$\frac{\dfrac{\Phi^x_v}{v \in \{x \mid \Phi\}}\ \text{F1}\qquad t = \{x \mid \Phi\}}{v \in t}\qquad \square$$

Bear in mind: one is talking in this instance of the Frege who is committed to each singular term's enjoying a denotation. That is why he would have eschewed the free logician's needed extra premise ∃!v in the rule { }E<sup>1</sup>.

Frege would also have been able to derive the rule { }E3:

$$\{\,\}\mathrm{E}_3\quad \frac{t = \{x \mid \Phi\} \qquad v \in t}{\Phi^x_v}$$

#### **Lemma 5.3** F2 ⇒ { }E3*.*

*Proof*

$$\frac{\dfrac{v \in t\qquad t = \{x \mid \Phi\}}{v \in \{x \mid \Phi\}}\ \text{Sub{=}}}{\Phi^x_v}\ \text{F2}\qquad \square$$

Suppose, for the sake of some imaginary and counterfactual speculation, that Frege could have been induced to consider the possibility (in advance of Russell's paradox) that certain kinds of singular terms might not always be secured denotations. He could have been invited to consider the possibility that the extensions of certain concepts were impossible to comprehend as individual, completed entities. Using Kripke's metaphor: such an 'extension' (because of some peculiarity of the concept whose extension it would erroneously be supposed to be) might resist being captured within the limited embrace of any 'intellectual lasso' trying to draw together all its members.

This speculative Frege, one presumes, would have recognized the free-logical validity of the rule { }E<sup>1</sup>, and would have remained content — with the free logician's concurrence — with the derivation which uses the rule F2 and which was supplied on his behalf in Lemma 5.3 for the rule { }E<sup>3</sup>. But he would have modified his erstwhile rule F1 to become fF1 ('fF' here for 'free-logical Frege'), furnished with the two existential presuppositions that are needed in the free-logical context:

$$\text{fFl} \quad \frac{\Phi(t) \quad \exists! t \quad \exists! \{\boldsymbol{x} \mid \Phi(\boldsymbol{x})\}}{t \in \{\boldsymbol{x} \mid \Phi(\boldsymbol{x})\}}$$

**Lemma 5.4** fF1 ⇒ { }E1*.*

$$\textit{Proof}\qquad \frac{\dfrac{\Phi(v)\qquad \exists! v\qquad \dfrac{t = \{x \mid \Phi\}}{\exists!\{x \mid \Phi\}}\ \text{RAD}}{v \in \{x \mid \Phi\}}\ \text{fF1}\qquad t = \{x \mid \Phi\}}{v \in t}\qquad \square$$

The natural-deduction theorist can return the favor, with the following converse.

**Theorem 5.5** { }E<sup>1</sup> ⇒ fF1*.*

#### *Proof*

$$\frac{\exists!\{x \mid \Phi(x)\}\qquad \dfrac{\dfrac{\overline{a = \{x \mid \Phi(x)\}}^{\,(1)}\qquad \exists! t\qquad \Phi(t)}{t \in a}\ \{\,\}\mathrm{E}_1\qquad \overline{a = \{x \mid \Phi(x)\}}^{\,(1)}}{t \in \{x \mid \Phi(x)\}}\ \text{Sub{=}}}{t \in \{x \mid \Phi(x)\}}\,(1)\qquad \square$$

#### **6 Russell's paradox**

Here we shall be scrupulous in using the 'official', parallelized form of the rule { }E<sup>3</sup>. This is in order to ensure the correctness of the claim that the Core logician can show that the Russell set does not exist.


Let us use the abbreviation r for the set-term {x | ¬x ∈ x}. Consider the following proof

$$\Sigma\ :\qquad \dfrac{\dfrac{\dfrac{\exists! r}{r = \{x \mid \neg x \in x\}}\ \text{Ref{=}}\qquad \overline{r \in r}^{\,(2)}\qquad \dfrac{\overline{\neg r \in r}^{\,(1)}\qquad \overline{r \in r}^{\,(2)}}{\bot}}{\bot}\,(1)\ \{\,\}\mathrm{E}_3}{\neg r \in r}\,(2)$$

Now use Σ to construct the proof

$$\Xi\ :\qquad \dfrac{\dfrac{\exists! r}{r = \{x \mid \neg x \in x\}}\ \text{Ref{=}}\qquad \exists! r\qquad \begin{matrix}\exists! r\\ \Sigma\\ \neg r \in r\end{matrix}}{r \in r}\ \{\,\}\mathrm{E}_1$$

Finally, embed Ξ twice as follows, to form the (dis)proof

$$\Pi\ :\qquad \dfrac{\dfrac{\exists! r}{r = \{x \mid \neg x \in x\}}\ \text{Ref{=}}\qquad \begin{matrix}\exists! r\\ \Xi\\ r \in r\end{matrix}\qquad \dfrac{\overline{\neg r \in r}^{\,(1)}\qquad \begin{matrix}\exists! r\\ \Xi\\ r \in r\end{matrix}}{\bot}}{\bot}\,(1)\ \{\,\}\mathrm{E}_3$$

The disproof Π is in normal form. It avails itself of only the following rules:


and all of these would be acceptable to Frege. Conspicuously absent is any appeal to Basic Law V (or, more accurately, (Va)). So Frege was in error in concluding

The error can only lie in our Law (Vb) which must therefore be false.

(Der Fehler kann allein in unserm Gesetze (Vb) liegen, das also falsch sein muss.)

(See the *Nachwort* in Frege (1903), at p. 257.) To be sure, as will be seen by the end of this study, one can use (Va) to get into Russellian trouble; but Russell's paradox can (as just seen) be derived from much more basic logical materials, in a manner whose strict formalization makes no use at all of the benighted (Va). Moreover, Russell's result, in this free-logical setting, is not a paradox at all. Rather, Π is a normal-form disproof of the claim ∃!r, that is, of ∃y y = {x | ¬x ∈ x}. By the rule ¬I it straightforwardly yields the negative existential theorem ¬∃y y = {x | ¬x ∈ x} in the logic of sets.

Constructivism and intuitionism in logic and mathematics, especially as formalisms, came well after Volume 2 of Frege's *Grundgesetze*. So Frege cannot be expected to have sought a constructive reductio, such as Π above, of the assumption that the Russell set exists. But it is worth pointing out that Π is indeed constructive. There is

no application within Π of any strictly classical rule for negation. Frege was barking up a wrong tree when he informally invoked the Law of Excluded Middle in his informal presentation (in his *Nachwort* to Volume 2 of the *Grundgesetze*) of the Russellian reasoning. As he pondered the right revisionary response to Russell's paradox, he considered the possibility that one might have to abandon the Law of Excluded Middle. He seriously posed the question

Should we assume the law of excluded middle fails for classes? (Sollen wir annehmen, das Gesetz vom ausgeschlossenem Dritten gelte von den Klassen nicht?)

(See the *Nachwort* in Frege (1903), at p. 254.) We can see now, however, that placing the blame on Excluded Middle, and abandoning it as a logical law, would have been futile. *For the Russell Paradox is a problem for the constructivist*, not just the classicist. Frege could have performed the formal reasoning in Π (even suppressing the middle premise of each application of { }E<sup>1</sup> therein), to reduce the assumption ∃!{x | ¬x ∈ x} to absurdity, without any appeal to Excluded Middle.
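The constructive character of the Russellian reasoning can be checked mechanically. The following Lean 4 sketch (an illustration offered here, not part of the original text) derives absurdity from the assumption that some object r satisfies the naive membership condition of the Russell set, with `Mem` a hypothetical membership relation:

```lean
-- A hedged illustration: if some r satisfied the Russell condition
-- (x ∈ r iff x ∉ x), absurdity follows constructively.
theorem russell {α : Type} (Mem : α → α → Prop) (r : α)
    (h : ∀ x, Mem x r ↔ ¬ Mem x x) : False :=
  -- First show r ∉ r: supposing r ∈ r immediately yields r ∉ r, hence ⊥.
  have hn : ¬ Mem r r := fun hr => (h r).mp hr hr
  -- But then the Russell condition puts r back into r: contradiction.
  hn ((h r).mpr hn)
```

No classical axiom (`Classical.em` or the like) is invoked, mirroring the point that the paradox afflicts the constructivist as much as the classicist.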

Frege's ultimate mistake, on this analysis of the Russell Paradox, is his assumption that every grammatically well-formed singular term must denote something. It is ironic that he required this of any 'logically perfect' language. The real folly of Basic Law V is how it visits Fregean 'logical perfection' on set-abstraction terms, *even in the context of an explicitly free logic.* The folly can be seen at work in the direction given by (Va), from coextensiveness of concepts to identity of their extensions. Recall that (Va) can be expressed as a rule as follows:

$$\frac{\forall x(\Phi x \leftrightarrow \Psi x)}{\{x \mid \Phi x\} = \{x \mid \Psi x\}}$$

Take Φ for Ψ. Then *in free logic* we have the proof

$$\begin{array}{c} \vdots\ \text{Logic} \\ \forall x (\Phi x \leftrightarrow \Phi x) \\ \hline \{x \mid \Phi x\} = \{x \mid \Phi x\} \\ \hline \exists!\{x \mid \Phi x\} \end{array} \begin{array}{c} \\ \\ \text{Va} \\ \text{RAD} \end{array}$$

So (Va) commits one to the existence of the set {x | Φx} for *every* concept (or predicate) Φ.

It is frequently remarked that Frege's error was to believe in the *Axiom Schema of Naive Comprehension*:

$$\forall \Phi \exists X \forall \mathbf{y} (\mathbf{y} \in X \leftrightarrow \Phi \mathbf{y}) .$$

This is indeed derivable — even in free logic — by appeal to (Va). It is worth pointing out that the derivation makes use only of the *elimination* rules { }E<sup>1</sup> and { }E<sup>3</sup> for the set-abstraction operator. No use is made either of { }E<sup>2</sup> or of the introduction rule. This in turn means that what the following proof reveals is invariant across the 'pure' vs. 'impure' divide.13 Nothing turns, that is, on whether the middle premise of the rule { }I takes the form ∃!t or St. (We shall have occasion to remark once again on 'pure vs. impure invariance' in due course.)
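In outline (a sketch of how such a derivation can run under the rules as stated, not a reproduction of the original figure): given any Φ, (Va) together with RAD yields ∃!{x | Φx}, as in the proof just displayed; Ref= then supplies the major premise {x | Φx} = {x | Φx}, from which the two directions of the comprehension biconditional fall out of the elimination rules:

```latex
% Both directions of  y ∈ {x | Φx} ↔ Φy,  with ∃!{x | Φx} supplied by (Va) + RAD:
\[
\frac{\{x \mid \Phi x\} = \{x \mid \Phi x\} \qquad \exists! y \qquad \Phi y}
     {y \in \{x \mid \Phi x\}}\ \{\,\}\mathrm{E}_1
\qquad\qquad
\frac{\{x \mid \Phi x\} = \{x \mid \Phi x\} \qquad y \in \{x \mid \Phi x\}}
     {\Phi y}\ \{\,\}\mathrm{E}_3
\]
% ↔I and ∀I (discharging ∃!y), followed by ∃I on the witness {x | Φx},
% then deliver  ∃X ∀y (y ∈ X ↔ Φy).
```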

#### **7 Extensionality**

One might wonder whether the introduction rule for the set-term forming operator plays any philosophically important role. The answer is that it makes a crucial contribution in proving the *extensionality* of sets — that two sets are identical if they have exactly the same members. The proof of this result (Theorem 7.1 below) invokes Theorem 4.1 — whose own proof involved a degenerate application of the rule { }I — and then makes use of a further, *non*-degenerate application of { }I. The natural-deduction theorist proves Theorem 7.1 using the rules for pure set theory (favoring ∃!t over St as the second premise of { }I). The reader can be left the exercise of modifying the proof so that with the rules involving S the result can be established in a suitably 'S-restricted' form.14

**Theorem 7.1** *Sets are identical if they have exactly the same members.*

*Sentential version:* ∀x∀y(∀z(z ∈ x ↔ z ∈ y) → x = y)*. Inferential version:*

$$\frac{\exists! t\qquad \exists! u\qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ a \in u\end{matrix}\qquad \begin{matrix}\overline{a \in u}^{\,(i)}\\ \vdots\\ a \in t\end{matrix}}{t = u}\,(i)$$

14 The S-modified result to be proved in the sentential version in Theorem 7.1 is

$$\forall x(Sx \to \forall y(Sy \to (\forall z(z \in x \leftrightarrow z \in y) \to x = y))).$$

Its inferential version is

$$\frac{St\qquad Su\qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ a \in u\end{matrix}\qquad \begin{matrix}\overline{a \in u}^{\,(i)}\\ \vdots\\ a \in t\end{matrix}}{t = u}\,(i)$$

<sup>13</sup> This remark is about only the sufficiency, and not necessarily the necessity, of eschewal of { }E<sup>2</sup> and { }I for the invariance in question.

#### *Proof* Sentential version:

$$\dfrac{\dfrac{\dfrac{\dfrac{\overline{a\in u}^{\,(1)}}{\exists! a}\quad \overline{\forall z(z\in t\leftrightarrow z\in u)}^{\,(2)}}{a\in t\leftrightarrow a\in u}\quad \overline{a\in u}^{\,(1)}}{a\in t}\qquad \overline{\exists! t}^{\,(4)}\qquad \dfrac{\dfrac{\dfrac{\overline{a\in t}^{\,(1)}}{\exists! a}\quad \overline{\forall z(z\in t\leftrightarrow z\in u)}^{\,(2)}}{a\in t\leftrightarrow a\in u}\quad \overline{a\in t}^{\,(1)}}{a\in u}}{t=\{x\mid x\in u\}}\,(1)\ \{\,\}\mathrm{I}$$

whence, by Theorem 4.1 and the rules for =, → and ∀:

$$\dfrac{\dfrac{\dfrac{\dfrac{t=\{x\mid x\in u\}\qquad \dfrac{\overline{\exists! u}^{\,(3)}}{u=\{x\mid x\in u\}}\ \text{(Th.4.1)}}{t=u}}{\forall z(z\in t\leftrightarrow z\in u)\to t=u}\,(2)}{\forall y(\forall z(z\in t\leftrightarrow z\in y)\to t=y)}\,(3)}{\forall x\forall y(\forall z(z\in x\leftrightarrow z\in y)\to x=y)}\,(4)$$

Inferential version:

$$\dfrac{\dfrac{\begin{matrix}\overline{a \in u}^{\,(1)}\\ \vdots\\ a \in t\end{matrix}\qquad \exists! t\qquad \begin{matrix}\overline{a \in t}^{\,(1)}\\ \vdots\\ a \in u\end{matrix}}{t = \{x \mid x \in u\}}\,(1)\ \{\,\}\mathrm{I}\qquad \dfrac{\exists! u}{u = \{x \mid x \in u\}}\ \text{(Th.4.1)}}{t = u}$$

Note that Extensionality as a *derived* result here — in either its sentential or its inferential version — does not itself contain any occurrences of the set-abstraction operator. In conventional (first-order) set theory, in the usual stripped-down language with ∈ as a predicate but *without* the set-abstraction operator primitive, one would need of course to follow Zermelo in *postulating* Extensionality as an axiom. One of the virtues of the natural-deduction rules essayed here is that Extensionality is 'built in' to the resulting conception of set, whether one is theorizing about pure or about impure sets.
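The moral that extensionality is 'built in' once sets are individuated by their membership conditions can be illustrated in Lean 4, where modeling a 'set' as its membership predicate makes extensionality a two-step consequence of function and propositional extensionality (an illustration under that modeling assumption, not the author's formalism):

```lean
-- Model a 'set' over α as its membership predicate α → Prop.
-- Coextensive sets are then identical, via funext + propext.
theorem setExt {α : Type} (s t : α → Prop)
    (h : ∀ x, s x ↔ t x) : s = t :=
  funext fun x => propext (h x)
```

Here `funext` and `propext` play a role loosely analogous to the degenerate and non-degenerate applications of { }I in the proof of Theorem 7.1.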

#### **8 Natural-deduction for set abstraction vs. Fregean abstraction**

The question now arises: just how radical a departure from the (incoherent) Fregean conception of class (or set) is represented by the conception captured by the natural deduction rules for the set-abstraction operator, in a free logic? What is the relationship between the latter rules, and Frege's Basic Law V (whose half called Vb is what Frege — mistakenly, in our free-logical view — blamed for Russell's paradox)?

The following completely formal derivations will reveal the answer.

The innocent half of Basic Law V (call it Vb):

$$(\mathbf{Vb}) \quad \frac{\{\mathbf{x} \mid F\mathbf{x}\} = \{\mathbf{x} \mid G\mathbf{x}\}}{\forall \mathbf{x} (F\mathbf{x} \leftrightarrow G\mathbf{x})}$$

can be derived by the natural-deduction theorist (see Theorem 8.1) *and by the Fregean* (see Lemma 8.2). This latter result is interesting; it stems from just the rules F1 and F2 adopted by the Fregean who assumes a 'logically perfect' language. So the

Frege who would venture to adopt both F1 and F2 as logically correct and primitive inferences need not have bothered to state Vb as a basic law, or as a conjunctive part of any other basic law (such as the relevant half, in one direction, of the biconditional Basic Law V).

#### **Theorem 8.1** { }E1*,* { }E<sup>3</sup> ⇒ Vb*.*

*Proof* Note the symmetry in the proof (to be expected with '=' and '↔'), and the use of the two main elimination rules for { }, but not of the introduction rule. This is another instance of 'pure vs. impure invariance'.

$$\dfrac{\dfrac{\dfrac{\dfrac{\dfrac{\{x\mid Fx\}=\{x\mid Gx\}}{\exists!\{x\mid Fx\}}\ \text{RAD}}{\{x\mid Fx\}=\{x\mid Fx\}}\ \text{Ref{=}}\qquad \dfrac{\{x\mid Fx\}=\{x\mid Gx\}\quad \overline{\exists! a}^{\,(2)}\quad \overline{Ga}^{\,(1)}}{a\in\{x\mid Fx\}}\ \{\,\}\mathrm{E}_1}{Fa}\ \{\,\}\mathrm{E}_3\qquad \dfrac{\{x\mid Fx\}=\{x\mid Gx\}\qquad \dfrac{\dfrac{\dfrac{\{x\mid Fx\}=\{x\mid Gx\}}{\exists!\{x\mid Fx\}}\ \text{RAD}}{\{x\mid Fx\}=\{x\mid Fx\}}\ \text{Ref{=}}\quad \overline{\exists! a}^{\,(2)}\quad \overline{Fa}^{\,(1)}}{a\in\{x\mid Fx\}}\ \{\,\}\mathrm{E}_1}{Ga}\ \{\,\}\mathrm{E}_3}{Fa\leftrightarrow Ga}\,(1)}{\forall x(Fx\leftrightarrow Gx)}\,(2)\qquad \square$$

#### **Lemma 8.2** F1, F2 ⇒ Vb*.*

$$\textit{Proof}\qquad \dfrac{\dfrac{\dfrac{\dfrac{\overline{Ga}^{\,(1)}}{a\in\{x\mid Gx\}}\ \text{F1}\qquad \{x\mid Fx\}=\{x\mid Gx\}}{a\in\{x\mid Fx\}}\ \text{Sub{=}}}{Fa}\ \text{F2}\qquad \dfrac{\dfrac{\dfrac{\overline{Fa}^{\,(1)}}{a\in\{x\mid Fx\}}\ \text{F1}\qquad \{x\mid Fx\}=\{x\mid Gx\}}{a\in\{x\mid Gx\}}\ \text{Sub{=}}}{Ga}\ \text{F2}}{\dfrac{Fa\leftrightarrow Ga}{\forall x(Fx\leftrightarrow Gx)}\,(1)}\qquad \square$$

Can Lemma 8.2 be strengthened by using the 'free logical' Fregean rule fF1 in place of F1? The answer is affirmative. See Lemma 8.9.

The natural-deduction theorist can prove not the converse of Vb — which would be the inconsistent Va — but a slight weakening of it, by adding an existential presupposition. The inference to be established is

$$\text{fVa} \quad \frac{\forall \mathfrak{x} (F\mathfrak{x} \leftrightarrow G\mathfrak{x}) \quad \exists! \{\mathfrak{x} \mid F\mathfrak{x}\}}{\{\mathfrak{x} \mid F\mathfrak{x}\} = \{\mathfrak{x} \mid G\mathfrak{x}\}}$$


**Theorem 8.3** { }E1*,* { }E3*,* { }I ⇒ fVa*.*

*Proof*

Note that easy re-lettering will enable one to use the premise ∃!{x | Gx} instead of ∃!{x | Fx}. The proof of Theorem 8.3 uses only the two main elimination rules for { }, along with its introduction rule.

The natural-deduction theorist's proof of Theorem 8.3 just given can be adapted so as to employ the rule { }I in its S-modified form. The adapted proof follows. Note that it appeals to Observation 8.4 in its middle immediate subproof. The natural-deduction theorist's proof of Observation 8.4 is presumed to be available here, and will be found on p. 106. It involves only two primitive steps.

So we see that Basic Law V, suitably conditioned in its problematic direction with a much-needed existential premise, is derivable in our free logic for the set-term forming operator { }. It is fVa that is the (modified) Fregean way to express the fact that sets are extensional; and in deriving it, the natural-deduction theorist needs to use { }I.

Is the natural-deduction theorist's logic of sets tantamount to nothing more than Basic Law V thus modified? The answer would presumably be affirmative if, but only if, by using the inferences

$$\frac{\{x \mid Fx\} = \{x \mid Gx\}}{\forall x(Fx \leftrightarrow Gx)} \qquad\qquad \frac{\forall x(Fx \leftrightarrow Gx)\qquad \exists!\{x \mid Fx\}}{\{x \mid Fx\} = \{x \mid Gx\}}$$

the free-logical Fregean could derive (in free logic) the introduction and elimination rules that have been stated for { }. The rules { }E<sup>1</sup> and { }E<sup>3</sup> have already been furnished with derivations of the requisite kind (see Lemmas 5.2 and 5.3). The reader will recall that (in the case of pure set theory) { }E<sup>2</sup> is a special case of the rule RAD of free logic. And in the case of a set theory countenancing *Urelemente*, the S-modified rule

$$\{\,\}\mathrm{E}_2\quad \frac{t = \{x \mid \Phi\}}{St}$$

will in due course be adopted by (us, on behalf of) the Fregean as a primitive rule, to be called F3 (see below). The remaining task, then, for the Fregean, is to derive { }I. Can this be done?

The answer is a cautious affirmative. The caution is occasioned by the residual need *on the part of the Fregean* to supply the inferential transition occurring twice in the following proof, as indicated by the descending dots:

$$\dfrac{\dfrac{\dfrac{\begin{matrix}\Delta_2\\ \Pi_2\\ \exists! t\\ \vdots\\ t=\{x\mid x\in t\}\end{matrix}}{\exists!\{x\mid x\in t\}}\ \text{RAD}\qquad \dfrac{\dfrac{\begin{matrix}\overline{\exists! a}^{\,(2)},\ \overline{\Phi^x_a}^{\,(2)},\ \Delta_1\\ \Pi_1\\ a\in t\end{matrix}\qquad \begin{matrix}\overline{a\in t}^{\,(1)},\ \Delta_3\\ \Pi_3\\ \Phi^x_a\end{matrix}}{a\in t\leftrightarrow\Phi^x_a}\,(1)}{\forall x(x\in t\leftrightarrow\Phi)}\,(2)}{\{x\mid x\in t\}=\{x\mid\Phi\}}\ \text{fVa}\qquad \begin{matrix}\Delta_2\\ \Pi_2\\ \exists! t\\ \vdots\\ t=\{x\mid x\in t\}\end{matrix}}{t=\{x\mid\Phi\}}$$

That inferential transition is of course guaranteed by Theorem 4.1, to which the free-logical, natural-deduction theorist about sets is entitled. But would either the original Frege, or the speculatively 'free-logicized' Frege, be thus entitled?

Let us explore some conceptual and logical possibilities on Frege's behalf. We know that he conceived of his *Begriffsschrift* as a language for the pursuit of truths not just about the abstract realm, but also about concrete reality. So he would not have been satisfied with being confined to talk only of sets. 'Pure-set' theorizing would have been too restrictive, expressively, for Frege. So it is reasonable to infer that he would have been prepared to adopt a primitive predicate like S, so as to be able to express the informal idea, concerning any supplied argument (*Gegenstand*), that it is a set — that is, (for Frege) the completed extension of some concept. For any *Urelement*, of course, what is thus expressed is false.

Countenancing the possibility of *Urelemente*, Frege would have refused — correctly — to adopt the inferential principle

$$\frac{\exists! t}{St}.$$

But he would have been happy — rightly — with its converse:

$$\frac{St}{\exists! t}.$$

This rule says that every set exists. And, since St is an atomic predication, the rule is a special case of the Rule of Atomic Denotation.

The distinction was drawn earlier between *pure set* theorizing and *impure set* theorizing. Frege's theorizing was of the latter kind, since he regarded the universe of discourse as truly universal, and therefore containing all concrete objects (*Urelemente*, such as Julius Caesar) along with sets formed from them (such as the singleton {Julius Caesar}), *and* along with sets that happened to be hereditarily pure (such as {x | x = {y | ¬y = y}}). Suppose, however, that Frege had been asked to theorize in a more focused way about the (sub-)universe (for him) of hereditarily pure sets. It would have been both obvious and natural for him to give expression to this expressive focus by adopting the following rule (for 'purity'):

$$\pi \quad \frac{\exists! t}{St}$$

where the intended reading of 'St' is (as always here) 't is a set'. It will follow from this that t is *hereditarily pure* as a set, since every member of t exists, hence (by rule π) is a set. Since no *Urelement* is a set:

$$\frac{Ut \qquad St}{\bot}$$

the membership pedigree of t does not contain any *Urelemente*, and t is accordingly pure. A little bit of sethood, given mere existence, goes a long way. The foregoing rule of contrariety is primitive for both the Fregean and the natural-deduction theorist when they allow for *Urelemente*.


What would it take, then, on the part of some *Gegenstand* t, to earn the honorific sortal S from Frege? Surely it would be enough that t really *be* the extension of some concept Φ. That is, the following inferential principle would be logico-analytically valid for the Fregean, and self-evident:

$$\text{F3} \quad \frac{t = \{x \mid \Phi(x)\}}{St}$$

This rule says that if t is the set of all Φs, then t is a set. It is primitive for the ND-theorist (for it is the set-abstraction elimination rule { }E<sup>2</sup>), and the Fregean has every right to adopt it as a primitive rule too.

**Observation 8.4** $\dfrac{\exists!\{x \mid \Phi(x)\}}{S\{x \mid \Phi(x)\}}$*.*

*Proof (by the Fregean)*

$$\frac{\exists!\{x \mid \Phi(x)\}\qquad \dfrac{\dfrac{\overline{a = \{x \mid \Phi(x)\}}^{\,(1)}}{Sa}\ \text{F3}\qquad \overline{a = \{x \mid \Phi(x)\}}^{\,(1)}}{S\{x \mid \Phi(x)\}}\ \text{Sub{=}}}{S\{x \mid \Phi(x)\}}\,(1)\qquad \square$$

*Proof (by the ND-theorist)*

$$\frac{\dfrac{\exists!\{x \mid \Phi(x)\}}{\{x \mid \Phi(x)\} = \{x \mid \Phi(x)\}}\ \text{Ref{=}}}{S\{x \mid \Phi(x)\}}\ \{\,\}\mathrm{E}_2\qquad \square$$

Observation 8.4 tells us that even though, as pointed out above, the inference

$$\frac{\exists!t}{St}$$

does not hold in general, it *does* hold when, more specifically, we have a set-abstraction term in place of $t$.

We have been considering a Frege who distinguishes between sets and *Urelemente*. So for fair comparison of his system with that of the free-logical natural-deduction theorist, the latter must give up commitment to theorizing only about sets. This means that the rule { }I must have its second premise in the form $\exists!a$; and, correspondingly, the elimination rule { }E<sup>2</sup> will be

$$\{\,\}\text{E}_2 \quad \frac{t = \{x \mid \Phi\}}{St}$$

The sought derivation of the ∃!-modified rule { }I using Fregean principles will eventually be found. See Lemma 8.8 below.

Now consider the prospect of having some sort of converse of principle F3. Suppose one is given just the premise $St$. Then one knows that $t$ is the extension of *some* concept or other. And what might be the most general — indeed canonical — concept that would fit this bill? Why,

'. . . is a member of $t$',

of course. So the following inferential principle would be logico-analytically valid for the Fregean:

$$\mathbf{F4} \quad \frac{\mathbf{St}}{t = \{x \mid x \in t\}}$$

Note that the primitive rule F4 here being accepted by (us, on behalf of) the Fregean is Theorem 4.2 of the natural-deduction theorist. Both theorists have it *as part of their logic of sets*. The only difference is that for the Fregean it is primitive — because it has to be — whereas for the natural-deduction theorist it is derived.
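For readers who like to check such principles against a concrete model: the content of F4 — that a set is identical to the set of exactly its members — can be verified in any finite model. The following sketch is purely illustrative (ours, not part of either theorist's formal system); Python's `frozenset` stands in for finite pure sets, and the function name is our own invention.

```python
# Finite-model illustration of F4: from St, conclude t = {x | x ∈ t}.
# A frozenset plays the role of a (finite, pure) set.

def abstraction_on_membership(t):
    """Form {x | x ∈ t} by abstracting on the membership condition."""
    return frozenset(x for x in t)

t = frozenset({1, 2, 3})
assert t == abstraction_on_membership(t)   # t = {x | x ∈ t}

# The empty set is no exception:
assert frozenset() == abstraction_on_membership(frozenset())
```

Such finite checks are of course only heuristic; F4's place in the formal systems is settled by the derivations in the text.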

Why do we say that F4 *has to be* primitive for the Fregean? The answer is that inspection of all of the Fregean's other primitive rules — RAD, Sub=, Ref=, fF1, F2, F3, fVa, and π — reveals that one cannot use them to derive the conclusion $t = \{x \mid x \in t\}$ from the premise $St$.

The 'pure-set' Fregean, by adopting the rules π and F4, is able to mimic the natural-deduction theorist's Theorem 4.1 as follows.

**Lemma 8.5** π, F4 ⇒ $\dfrac{\exists!t}{t = \{x \mid x \in t\}}$ *.*

*Proof*

$$\frac{\dfrac{\exists!t}{St}\,\pi}{t = \{x \mid x \in t\}}\,\text{F4} \qquad \Box$$

The question arises: can the Fregeans prove Theorem 7.1 using their own rules thus far, so as to parallel and emulate what the natural-deduction theorist did? The answer is affirmative. We shall deal with Extensionality in its inferential form.

**Lemma 8.6** F4, π, fVa ⇒

$$\frac{\exists!t \qquad \exists!u \qquad \begin{array}{c}\overline{a\in t}^{\,(i)}\\ \vdots\\ a\in u\end{array} \qquad \begin{array}{c}\overline{a\in u}^{\,(i)}\\ \vdots\\ a\in t\end{array}}{t=u}\,(i)\;.$$

*Proof* The two subordinate derivations yield $a\in t \leftrightarrow a\in u$ by ↔I, discharging the assumptions (1); ∀I then gives $\forall x(x\in t \leftrightarrow x\in u)$. From $\exists!t$, Lemma 8.5 gives $t=\{x\mid x\in t\}$, whence, by RAD, $\exists!\{x\mid x\in t\}$; so fVa yields $\{x\mid x\in t\}=\{x\mid x\in u\}$. From $\exists!u$, Lemma 8.5 gives $u=\{x\mid x\in u\}$. Two applications of Sub$=$ then deliver $t=u$. □

Fregean rule F4 is potent in another important regard. Teaming up with fVa, it yields the natural-deduction theorist's introduction rule { }I — both in its original version (ensuring purity of sets) and in its ∃!-modified version (allowing for *Urelemente*) — as shown, respectively, by Lemma 8.7 and Lemma 8.8.

**Lemma 8.7** F4*,* fVa*,* π ⇒ { }I (pure-set version)*.*

*Proof* Recall that the proof of Lemma 8.5 used F4. Let Π<sub>1</sub> be the given derivation of Φ from the assumption $a\in t$ (class (1)) and side premises Δ<sub>1</sub>, and Π<sub>3</sub> the given derivation of $a\in t$ from Φ (class (1)), $Sa$, and side premises Δ<sub>3</sub>. Assume $\exists!a$ (class (2)); by π, this yields the $Sa$ that Π<sub>3</sub> requires. ↔I, discharging (1), gives $a\in t \leftrightarrow \Phi$; free-logical ∀I, discharging (2), gives $\forall x(x\in t \leftrightarrow \Phi)$. From Δ<sub>2</sub> via Π<sub>2</sub> we have $\exists!t$; by Lemma 8.5, $t=\{x\mid x\in t\}$, whence, by RAD, $\exists!\{x\mid x\in t\}$. By fVa, $\{x\mid x\in t\}=\{x\mid\Phi\}$; Sub$=$ with $t=\{x\mid x\in t\}$ then gives $t=\{x\mid\Phi\}$. □

**Lemma 8.8** F4*,* fVa ⇒ { }I (∃!-modified version)*.*

*Proof*

From Δ<sub>2</sub> via Π<sub>2</sub> we have $St$; by F4, $t=\{x\mid x\in t\}$, whence, by RAD, $\exists!\{x\mid x\in t\}$. Let Π<sub>1</sub> be the given derivation of Φ from the assumption $a\in t$ (class (1)) and side premises Δ<sub>1</sub>, and Π<sub>3</sub> the given derivation of $a\in t$ from Φ (class (1)), $\exists!a$ (class (2)), and side premises Δ<sub>3</sub>. ↔I, discharging (1), gives $a\in t \leftrightarrow \Phi$; free-logical ∀I, discharging (2), gives $\forall x(x\in t \leftrightarrow \Phi)$. By fVa, $\{x\mid x\in t\}=\{x\mid\Phi\}$; Sub$=$ with $t=\{x\mid x\in t\}$ then gives $t=\{x\mid\Phi\}$. □

By Lemma 8.7, the Fregean rules F4, fVa, and π suffice for proof of { }I in its original form (ensuring purity); and by Lemma 8.8, F4 and fVa suffice for proof of { }I in its ∃!-modified form (allowing for *Urelemente*). No matter which form of it is used, { }I in turn suffices for proof of extensionality in the relevant form (with unrestricted or $S$-restricted quantifiers, as seen from Theorem 7.1 and the comment thereon in footnote 14).

Recall Theorem 8.1:

{ }E1, { }E<sup>3</sup> ⇒ Vb,

Lemma 5.4:

$$\text{fF1} \implies \{\,\}\text{E}_1,$$

and Lemma 5.3:

F2 ⇒ { }E3.

It follows by 'rule transitivity' that

fF1, F2 ⇒ Vb.

Here is a more direct proof of this last result. It can be obtained by accumulating the proofs of Theorem 8.1, Lemma 5.4, and Lemma 5.3, and applying to that accumulation the two 'shrinking reductions' that the reader will find are obviously called for.

#### **Lemma 8.9** fF1, F2 ⇒ Vb*.*

*Proof* Assume $\exists!a$ (2). One direction: assume $Ga$ (1). From the premise $\{x\mid Fx\}=\{x\mid Gx\}$, RAD gives $\exists!\{x\mid Gx\}$; so fF1 (with $\exists!a$) yields $a\in\{x\mid Gx\}$; Sub$=$ with the premise yields $a\in\{x\mid Fx\}$; and F2 yields $Fa$. The other direction: assume $Fa$ (1); RAD gives $\exists!\{x\mid Fx\}$; fF1 yields $a\in\{x\mid Fx\}$; Sub$=$ yields $a\in\{x\mid Gx\}$; and F2 yields $Ga$. ↔I, discharging (1), gives $Fa\leftrightarrow Ga$; free-logical ∀I, discharging (2), gives $\forall x(Fx\leftrightarrow Gx)$. □

So Vb is redundant for the (modified) Fregean. Note how no use is made of Vb in the (modified) Fregean's derivations of the ND-rules.

#### **9 Taking stock**

Let us take stock of the progress made thus far, in our comparison of the 'free-logical' Fregean with the (likewise 'free-logical') natural-deduction set theorist. Let us henceforth call each of them simply *free*, rather than 'free-logical'.

Bear in mind the following three crucial points of methodological agreement between them.


Here are the basic inferential principles *for sets* espoused by these two theorists.

**The free Fregean's basic inferential principles for sets:**

$$\text{fF1} \quad \frac{\Phi(t) \qquad \exists!\{x \mid \Phi(x)\} \qquad \exists!t}{t \in \{x \mid \Phi(x)\}} \qquad\qquad \text{F4} \quad \frac{St}{t = \{x \mid x \in t\}}$$

110 Neil Tennant

$$\text{F2} \quad \frac{t \in \{x \mid \Phi(x)\}}{\Phi(t)} \qquad\qquad \text{fVa} \quad \frac{\forall x(\Phi x \leftrightarrow \Psi x) \qquad \exists!\{x \mid \Phi x\}}{\{x \mid \Phi x\} = \{x \mid \Psi x\}}$$

$$\text{F3} \quad \frac{t = \{x \mid \Phi(x)\}}{St} \qquad\qquad \pi \quad \frac{\exists!t}{St}$$

#### **The free natural-deduction theorist's basic inferential principles for sets:**

$$\{\,\}\text{I} \quad \frac{\exists!t \qquad \begin{array}{c}\overline{a\in t}^{\,(i)}\\ \vdots\\ \Phi^{x}_{a}\end{array} \qquad \begin{array}{c}\overline{\Phi^{x}_{a}}^{\,(i)},\ \overline{\exists!a}^{\,(i)}\\ \vdots\\ a\in t\end{array}}{t=\{x\mid\Phi\}}\,(i)$$

$$\{\,\}\text{E}_1 \quad \frac{t=\{x\mid\Phi\} \qquad \exists!\nu \qquad \Phi^{x}_{\nu}}{\nu\in t} \qquad\qquad \{\,\}\text{E}_2 \quad \frac{t=\{x\mid\Phi\}}{\exists!t}$$

{ }E<sup>3</sup> serial form:

$$\frac{t=\{x\mid\Phi\} \qquad \nu\in t}{\Phi^{x}_{\nu}}$$

parallelized form:

$$\frac{t=\{x\mid\Phi\} \qquad \nu\in t \qquad \begin{array}{c}\overline{\Phi^{x}_{\nu}}^{\,(i)}\\ \vdots\\ \theta\end{array}}{\theta}\,(i)$$

A remarkable contrast strikes the eye. The (free) Fregean states rules that make no provision for discharge of assumptions, whether in sentential form or in rule form. The natural-deduction theorist, however, states { }I so as to allow discharge of certain assumptions. It is also a single-barreled rule. These are crucial reasons why the latter's rules, overall, provide a more succinct and unified account of the interrelations among the concepts involved. This is the case even when both theorists are employing free logic and are achieving pure vs. impure invariance in their theorizing about sets. We have learned the lesson that the Gentzenian approach, allowing for rules that effect discharge of assumptions, is an essential advance over the Fregean one, and frees the set-logician to do more with less.


**Summary of the Fregean's results:**

Lemma 5.2: F1 ⇒ { }E1.
Lemma 5.3: F2 ⇒ { }E3.
Lemma 5.4: fF1 ⇒ { }E1.
F3 is { }E2; hence F3 ⇒ { }E2.
Lemma 8.2: F1, F2 ⇒ Vb.
Lemma 8.5: π, F4 ⇒ $\dfrac{\exists!t}{t=\{x\mid x\in t\}}$.
Lemma 8.6: F4, π, fVa ⇒ Extensionality in rule form (from $\exists!t$, $\exists!u$ and subordinate derivations of $a\in u$ from $a\in t$ and of $a\in t$ from $a\in u$, infer $t=u$).
Lemma 8.7: F4, fVa, π ⇒ { }I in its original form (ensuring purity).
Lemma 8.8: F4, fVa ⇒ { }I in its ∃!-modified form (allowing *Urelemente*).
Lemma 8.9: fF1, F2 ⇒ Vb.

**Summary of the ND-theorist's results:**

Theorem 4.1: $\exists!t \vdash t=\{x\mid x\in t\}$.
Theorem 4.2: { }I ⇒ F4.
Theorem 5.1: { }E<sup>3</sup> ⇒ F2.
Theorem 5.5: { }E<sup>1</sup> ⇒ fF1.
{ }E<sup>2</sup> is F3; hence { }E<sup>2</sup> ⇒ F3.
Theorem 7.1: { }I ⇒ Extensionality in rule form.
Theorem 8.1: { }E1, { }E<sup>3</sup> ⇒ Vb.
Theorem 8.3: { }E1, { }E3, { }I ⇒ fVa.

#### **10 Summary of our comparison of the free Fregean approach with the free ND-approach**


Clearly, the free Fregean approach is equivalent to the free ND-approach. Each primitive rule of the one theorist is either primitive for, or derivable by, the other theorist. The free Fregean, however, adopts as primitive the rule F4, which the ND-theorist easily but non-trivially derives. Furthermore, comparison of their respective proofs of the non-trivially derivable result of Extensionality reveals that the ND-proof is more succinct than the Fregean one. They tie, however, in proving Vb — taking ten primitive steps each.

There is a satisfying unity to the ND-approach that is lacking in the Fregean approach. Having harmoniously balanced introduction and elimination rules for set-abstraction is a definite plus. These rules require only minor tweaks to toggle between the pure and the impure conceptions of sets. The Fregean, by contrast, resorts to adopting two new primitive rules — F4 and π — to ensure restriction to pure sets.

The extension of Gentzenian methods from the usual logical operators so as to include also the operator for set-abstraction appears to yield a methodological advantage. If the methods of natural deduction had been available to Frege, the tradition could arguably have delivered an ontologically non-committal *logical foundation* for abstraction, membership, sethood and predication. On that foundation Zermelo and his successors could then have built further, by supplying axioms and axiom schemata for outright existence (e.g., the empty set) and conditional existence (e.g., the pair set of any two things).

The original sin revealed by Russell's paradox can be viewed, through this new lens, as Frege's insistence that every well-formed singular term denotes. The fateful Va codified that insistence as it concerned set-abstraction terms in particular. If Frege's erroneous conception of 'logical perfection' could have been eliminated earlier, the route would have been cleared to his acceptance of all the principles of modern free logic. He could have had a logic of sets, but without any sets as logical objects. That the set-term $\{x \mid \neg x \in x\}$ does not denote would then have been no more disastrous a discovery than that the definite descriptive term $\iota x(\Phi x \wedge \neg\Phi x)$ does not either.

Foundationalists know, as practicing mathematicians themselves, that all our mathematical reasoning is (intuitively) *relevant*, and therefore should be able to be regimented in a formal logic devoid of the paradoxes of irrelevance. There is foundational dispute over whether mathematical reasoning should be *constructive*; and we know now, even from a constructive standpoint, that the classical extensions of constructive theories are consistent if the latter are. The lesson that emerges here is that adopting a *free* logic for the foundations of mathematics appears to be crucial for the *constructivist* to ensure consistency.

This brings us to the end of our comparison of the Fregean and the natural-deduction theorist's approaches to the primitive rules for the *logic* of sets.

We turn now to explain how the natural-deduction theorist can pursue the project of framing introduction and elimination rules even further, so as to capture all the familiar 'compiled' concepts of set theory. The main philosophical lesson of the remaining part of our investigation is that one can capture the meanings of set-theoretic predicates and operators without incurring any ontological commitments. The latter commitments result only from subsequent existential postulations, either outright or conditional. But the concepts embedded in such postulates are *already understood*, thanks to the ontologically non-committal rules of introduction and elimination that govern them in a *free* logic. So the main 'takeaway' is that the postulates of set theory do *not* serve 'implicitly to defne' the concepts involved. Rather, those concepts are already available to be grasped *before* any existential postulation is undertaken. Such is the power of harmoniously balanced introduction and elimination rules.

#### **11 An inferentialist treatment of set-theoretic pasigraphs**

At present our official list of primitive expressions contains only the logical operators ¬, ∧, ∨, →, ∃, and ∀, the identity predicate =, the membership predicate ∈, and the (singular) term-forming variable-binding set-abstraction operator $\{x \mid \ldots x \ldots\}$.

Set theory, however, is not laid out in such austerely primitive vocabulary. Set theorists and ordinary mathematicians making use of set-theoretic ideas employ a host of already familiar-looking *defined* expressions (such as '⊆' for 'is a subset of', and '$\mathcal{P}$' for 'the power set of'). These defined expressions are indispensable for communicating in a conveniently condensed fashion what would otherwise be extremely cumbrously expressed set-theoretical thoughts. The inferentialist seeks to frame rules of introduction and elimination for these defined expressions, so that they can be understood as being employed as 'local primitives' in mathematical discourse of the normal explicit texture.

When these defined function-operators and predicates are supplied with rules of inference, we call those operators and predicates *pasigraphs*, because they are so readily recognizable as pieces of notation in use, despite being absent from the official list of primitive expressions. In some cases we shall invent new pasigraphs, in order to express in yet more succinct symbolic form what practicing set theorists often render only in 'logician's English'. This enables one to be uniform and thorough in rigorously regimenting informal set-theoretical proofs as formal, logical proofs.

Every defined notion in set theory (as in any other branch of mathematics) stands at the apex of its own 'pyramid of preceding definitions'. The official primitive expressions form its base. There are of course only finitely many preceding definitions in any such pyramid; the process of constructing new concepts is always well-founded. Those definitions 'lower down' in the pyramid will be of pasigraphs that can be employed in the statement of introduction and elimination rules for the new notion at the apex. Not all of them need be thus employed; but they are eligible to be. Typically, the new notion at the apex will have its rules framed by using earlier notions just a layer or two down. But this is not necessarily the case in general.

We recall the special rule of free logic called the *Rule of Atomic Denotation*:

$$\frac{A(\ldots,t,\ldots)}{\exists!t}$$

We saw also that in free logic for languages containing function signs as primitives, there is the *Rule of Functional Denotation*:

$$\frac{\exists!f(\ldots,t,\ldots)}{\exists!t}$$

We shall adopt the following constraint on the formulation of any new concepts represented by our pasigraphs:


#### **11.1 Pasigraphs for restricted quantifications**

We shall start with a pasigraph that is neither an operator- nor a predicate-pasigraph. We are very familiar with the following form of generalization:

$$\forall \mathbf{x} \in t \,\varphi(\mathbf{x})\text{.}$$

But strictly speaking the universal quantifier ∀, as a logical primitive, is a *unary* quantifier; so the foregoing form is that of a binary-quantifier *pasigraph*, which needs either to be defined explicitly:

$$\forall \mathbf{x} (\mathbf{x} \in \mathfrak{r} \to \varphi(\mathbf{x}))$$

or to have its meaning specifed by rules that can govern it as an apparent primitive:

$$\forall\text{I} \quad \frac{\begin{array}{c}\overline{a\in t}^{\,(i)}\\ \vdots\\ \varphi^{x}_{a}\end{array}}{\forall x\,{\in}\,t\ \varphi}\,(i) \qquad\qquad \forall\text{E} \quad \frac{\forall x\,{\in}\,t\ \varphi \qquad u\in t}{\varphi^{x}_{u}}$$

We are very familiar also with the following form of generalization:

$$
\exists x \in t \,\varphi(x).
$$

But strictly speaking the existential quantifier ∃, as a logical primitive, is a *unary* quantifier; so the foregoing form is that of a binary-quantifier *pasigraph*, which needs either to be defined explicitly:

$$\exists \mathfrak{x} (\mathfrak{x} \in \mathfrak{t} \land \varphi(\mathfrak{x})) $$

or to have its meaning specifed by rules that can govern it as an apparent primitive:

$$\exists\text{I} \quad \frac{\varphi^{x}_{u} \qquad u\in t}{\exists x\,{\in}\,t\ \varphi} \qquad\qquad \exists\text{E} \quad \frac{\exists x\,{\in}\,t\ \varphi \qquad \begin{array}{c}\overline{a\in t}^{\,(i)},\ \overline{\varphi^{x}_{a}}^{\,(i)}\\ \vdots\\ \theta\end{array}}{\theta}\,(i)$$
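In any finite model the two restricted quantifiers agree with their unrestricted explicit definitions. The following sketch (ours, purely illustrative; the variable names are our own) renders ∀x∈t φ and ∃x∈t φ via Python's `all` and `any`:

```python
# Restricted quantifiers vs their unrestricted definitions, on a finite domain.
domain = {0, 1, 2, 3, 4, 5}
t = {1, 2, 3}
phi = lambda x: x % 2 == 1  # 'x is odd'

# ∀x ∈ t φ(x)  vs  ∀x (x ∈ t → φ(x))
restricted_all = all(phi(x) for x in t)
defined_all = all((x not in t) or phi(x) for x in domain)
assert restricted_all == defined_all

# ∃x ∈ t φ(x)  vs  ∃x (x ∈ t ∧ φ(x))
restricted_any = any(phi(x) for x in t)
defined_any = any((x in t) and phi(x) for x in domain)
assert restricted_any == defined_any
```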

#### **11.2 Pasigraph** ∅ **for empty set**

In primitive set-theoretic vocabulary the Axiom of Empty Set is as follows:

$$\exists! \{x \mid \neg x = x\}.$$

We shall now introduce the *constant pasigraph* ∅ for the empty set, governed by the following (existentially non-committal) rules.

$$\varnothing\text{I} \quad \frac{\exists!t \qquad \begin{array}{c}\overline{a\in t}^{\,(i)}\\ \vdots\\ \bot\end{array}}{t=\varnothing}\,(i) \qquad\qquad \varnothing\text{E} \quad \frac{t=\varnothing \qquad u\in t}{\bot}$$

The Axiom of Empty Set can now take the form ∃!∅. This is an outright existence postulate. Its inferential form is the zero-premise rule

∃!∅ .
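On the finite-model reading (again only an illustrative sketch of ours, with `frozenset` standing in for sets), the defining term {x | ¬x = x} indeed denotes the memberless set:

```python
# ∅ as {x | ¬ x = x}: no candidate satisfies x ≠ x.
domain = {0, 1, 2, "a"}
empty = frozenset(x for x in domain if x != x)

assert empty == frozenset()
# ∅E: membership in ∅ yields absurdity, i.e. nothing is a member.
assert all(x not in empty for x in domain)
```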

#### **11.3 Pasigraph for separated sets**

Suppose $\varphi$ has $x$ free. Then

$$\{\mathfrak{x} \in t \mid \varphi\} =\_{df} \{\mathfrak{x} \mid \mathfrak{x} \in t \wedge \varphi\}$$

The Axiom Scheme of Separation can now take the rule form

$$\frac{\exists!t}{\exists!\{x\in t\mid\varphi\}}$$

Instead of stipulating that $\{x \in t \mid \varphi\}$ is an abbreviation of $\{x \mid x \in t \wedge \varphi\}$, we could adopt as a grammatical primitive the term-formation operation on a term $t$ and a formula $\varphi$ with $x$ free, that produces $\{x \in t \mid \varphi\}$ as a genuine term of the language, from the two constituents mentioned. We could then furnish the operation with its own introduction and elimination rules as follows.

$$\frac{\exists!u \qquad \begin{array}{c}\overline{a\in u}^{\,(i)}\\ \vdots\\ a\in t\end{array} \qquad \begin{array}{c}\overline{a\in u}^{\,(i)}\\ \vdots\\ \varphi^{x}_{a}\end{array} \qquad \begin{array}{c}\overline{a\in t}^{\,(i)},\ \overline{\varphi^{x}_{a}}^{\,(i)}\\ \vdots\\ a\in u\end{array}}{u=\{x\in t\mid\varphi\}}\,(i)$$

$$\frac{u=\{x\in t\mid\varphi\}}{\exists!u} \qquad \frac{u=\{x\in t\mid\varphi\} \quad a\in u}{a\in t} \qquad \frac{u=\{x\in t\mid\varphi\} \quad a\in u}{\varphi^{x}_{a}} \qquad \frac{u=\{x\in t\mid\varphi\} \quad a\in t \quad \varphi^{x}_{a}}{a\in u}$$

By virtue of the foregoing Introduction and Elimination Rules for the Separation Pasigraph, along with the Introduction and Elimination Rules for the Set-Abstraction Operator, we have

$$t = \{x \in v \mid \varphi\} \ \vdash\ t = \{x \mid x \in v \wedge \varphi\}$$

and its converse

$$t = \{x \mid x \in v \wedge \varphi\} \ \vdash\ t = \{x \in v \mid \varphi\}.$$
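The agreement of the two readings of the separation pasigraph can be sanity-checked in a finite model. The sketch below is ours and purely illustrative (`frozenset` for sets, a Python predicate for φ):

```python
# Separation {x ∈ t | φ} coincides with abstraction {x | x ∈ t ∧ φ}
# over any finite domain.
domain = set(range(10))
t = {2, 3, 5, 7, 8}
phi = lambda x: x % 2 == 0

separated = frozenset(x for x in t if phi(x))                   # {x ∈ t | φ}
abstracted = frozenset(x for x in domain if x in t and phi(x))  # {x | x ∈ t ∧ φ}
assert separated == abstracted
```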

#### **11.4 Pasigraph ℙ for pair-sets**

In primitive set-theoretic vocabulary the Axiom of Pairing is as follows:

$$\forall x \forall y \exists! \{z \mid z = x \lor z = y\}$$

We shall now introduce a (binary) *operator pasigraph*.

We shall write $\mathbb{P}(u, v)$ for the pair set $\{z \mid z = u \vee z = v\}$. Note that the 'pair' set $\mathbb{P}(u, v)$ is a *singleton* if $u = v$ (whence both $u$ and $v$ exist).

The Axiom of Pairing can now also be expressed as follows:

$$
\forall x \forall y \exists! \mathbb{P}(x, y).
$$

The inferentialist working in free logic requires the interdeducibility

$$t = \mathbb{P}(u, v) \ \dashv\vdash\ t = \{z \mid z = u \vee z = v\}$$

rather than the provability of the identity

$$\mathbb{P}(u, v) = \{z \mid z = u \vee z = v\}.$$

The latter identity commits one to the existence of the pair set of $u$ and $v$:

$$
\exists!\mathbb{P}(u,v).
$$

The interdeducibility, however, does not carry such existential commitment. It allows one to pin down the meaning of ℙ as an operator on sets without committing one to its being everywhere (or indeed: *any*where) defined. One can grasp what ℙ means without yet adopting the Axiom of Pair Sets. And when we *do* adopt that axiom, it does not serve implicitly to define the meaning of ℙ. For that meaning will already have been defined by the Introduction and Elimination rules that we frame for ℙ.
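The defining condition of ℙ can be checked in a finite model; the following sketch is ours and purely illustrative (the function name `pair` is our own):

```python
# ℙ(u, v) read as {z | z = u ∨ z = v} in a finite model.
def pair(u, v):
    return frozenset({u, v})

domain = {0, 1, 2, 3}
u, v = 1, 3
p = pair(u, v)

# The abstraction term picks out exactly the same set:
assert p == frozenset(z for z in domain if z == u or z == v)
# Both u and v are members of ℙ(u, v), as the elimination rules demand.
assert u in p and v in p
```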

We propose the following introduction and elimination rules for the pairing operator ℙ.

$$\mathbb{P}\text{-I} \quad \frac{u\in t \qquad v\in t \qquad \begin{array}{c}\overline{a\in t}^{\,(i)}\\ \vdots\\ a=u \vee a=v\end{array}}{t=\mathbb{P}(u,v)}\,(i)$$

$$\mathbb{P}\text{-E}_1 \quad \frac{t=\mathbb{P}(u,v)}{u\in t} \qquad \mathbb{P}\text{-E}_2 \quad \frac{t=\mathbb{P}(u,v)}{v\in t} \qquad \mathbb{P}\text{-E}_3 \quad \frac{t=\mathbb{P}(u,v) \qquad w\in t}{w=u \vee w=v}$$

**Lemma 11.1** *The operator pasigraph* ℙ *obeys the Rule of Functional Denotation; that is, the following are provable:*

$$\frac{\exists!\mathbb{P}(u,v)}{\exists!u} \qquad\qquad \frac{\exists!\mathbb{P}(u,v)}{\exists!v}\,.$$

*Proof*

$$\frac{\exists!\,\mathbb{P}(u,v) \qquad \dfrac{\dfrac{\overline{a=\mathbb{P}(u,v)}^{\,(1)}}{u\in a}\,\mathbb{P}\text{-E}_1}{\exists!u}\,\text{RAD}}{\exists!u}\,(1) \qquad\qquad \frac{\exists!\,\mathbb{P}(u,v) \qquad \dfrac{\dfrac{\overline{a=\mathbb{P}(u,v)}^{\,(1)}}{v\in a}\,\mathbb{P}\text{-E}_2}{\exists!v}\,\text{RAD}}{\exists!v}\,(1) \qquad \Box$$

**Lemma 11.2** $t = \mathbb{P}(u, v) \vdash t = \{z \mid z = u \vee z = v\}$*.*

*Proof*

*Proof* From $t=\mathbb{P}(u,v)$, RAD gives $\exists!t$. Assume $a\in t$ (1); ℙ-E$_3$ gives $a=u \vee a=v$. Conversely, assume $a=u \vee a=v$ (1): if $a=u$, then ℙ-E$_1$ gives $u\in t$ and Sub$=$ yields $a\in t$; if $a=v$, ℙ-E$_2$ and Sub$=$ likewise yield $a\in t$. By { }I, discharging (1), $t=\{z\mid z=u \vee z=v\}$. □

**Lemma 11.3** $t = \{z \mid z = u \vee z = v\},\ \exists!u,\ \exists!v \vdash t = \mathbb{P}(u, v)$*.*

*Proof* Abbreviate $z = u \vee z = v$ as Φ, where convenient, to reduce sideways spread.

From $\exists!u$, Ref$=$ gives $u=u$, whence $u=u \vee u=v$, i.e. $\Phi^{z}_{u}$; with $t=\{z\mid\Phi\}$ and $\exists!u$, { }E$_1$ gives $u\in t$. Similarly, from $\exists!v$, { }E$_1$ gives $v\in t$. Assume $a\in t$ (1); with $t=\{z\mid\Phi\}$, { }E$_3$ gives $a=u \vee a=v$. By ℙ-I, discharging (1), $t=\mathbb{P}(u,v)$. □

#### **11.5 Pasigraph σ for singletons**

A special case of pairs $\mathbb{P}(u, v)$ arises when $u = v$. Here, the talk is of *singletons*. $\mathbb{P}(u, u)$ is often abbreviated as $\{u\}$. We, however, shall introduce a special singleton operator σ, and write $\sigma u$ for $\{u\}$.

The introduction and elimination rules for σ arise from the obvious simple modifications of the rules for ℙ.

$$\sigma\text{-I} \quad \frac{u\in t \qquad \begin{array}{c}\overline{a\in t}^{\,(i)}\\ \vdots\\ a=u\end{array}}{t=\sigma u}\,(i) \qquad \sigma\text{-E}_1 \quad \frac{t=\sigma u}{u\in t} \qquad \sigma\text{-E}_2 \quad \frac{t=\sigma u \qquad a\in t}{a=u}$$

The next two Lemmas answer a query from Ethan Brauer: "Do the rules enable one to prove that $\mathbb{P}(u, u)$ is $\sigma u$, given the existence of either one of them?".


$$\text{Lemma 11.4} \qquad \frac{\exists!\mathbb{P}(u, u)}{\mathbb{P}(u, u) = \sigma u}$$

*Proof*

From $\exists!\mathbb{P}(u,u)$, Ref$=$ gives $\mathbb{P}(u,u)=\mathbb{P}(u,u)$; by ℙ-E$_1$, $u\in\mathbb{P}(u,u)$. Assume $a\in\mathbb{P}(u,u)$ (2); with the identity again, ℙ-E$_3$ gives $a=u \vee a=u$, whence $a=u$ by ∨E, discharging (1). By σ-I, discharging (2), $\mathbb{P}(u,u)=\sigma u$. □

**Lemma 11.5** $\dfrac{\exists!\sigma u}{\sigma u = \mathbb{P}(u, u)}$ *.*

*Proof*

From $\exists!\sigma u$, Ref$=$ gives $\sigma u=\sigma u$; by σ-E$_1$, $u\in\sigma u$, which serves for both membership premises of ℙ-I. Assume $a\in\sigma u$ (1); by σ-E$_2$, $a=u$, whence $a=u \vee a=u$ by ∨I. By ℙ-I, discharging (1), $\sigma u=\mathbb{P}(u,u)$. □
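The content of Lemmas 11.4 and 11.5 — that the 'pair' of a thing with itself is its singleton — is immediate in a finite model. A purely illustrative sketch of ours (function names are our own):

```python
# Lemmas 11.4/11.5 in the finite model: ℙ(u, u) is σu.
def pair(u, v):       # ℙ
    return frozenset({u, v})

def singleton(u):     # σ
    return frozenset({u})

assert pair(4, 4) == singleton(4)
assert len(pair(4, 4)) == 1
```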

#### **11.6 The binary predicate pasigraph ∈² for 'is a member of a member of'**

Now consider the simple notion that *t is a member of a member of u*. Let us use for this the (binary) *predicate* pasigraph

$$t \in^2 u.$$

Here are the introduction and elimination rules for this new pasigraph:

$$\in^2\text{I} \quad \frac{t\in a \qquad a\in u}{t\in^2 u} \qquad\qquad \in^2\text{E} \quad \frac{t\in^2 u \qquad \begin{array}{c}\overline{t\in a}^{\,(i)},\ \overline{a\in u}^{\,(i)}\\ \vdots\\ \theta\end{array}}{\theta}\,(i)$$

**Lemma 11.6** $t \in^2 u \vdash \exists!t$ *and* $t \in^2 u \vdash \exists!u$*.*

*Proof*

*Proof* Given $t\in^2 u$, apply $\in^2$E, assuming $t\in a$ and $a\in u$ (1). RAD applied to $t\in a$ gives $\exists!t$; RAD applied to $a\in u$ gives $\exists!u$. In each case $\in^2$E discharges (1). □

**Lemma 11.7** $t \in^2 u \dashv\vdash \exists x(t \in x \wedge x \in u)$*.*

*Proof*

*Proof* Left to right: given $t\in^2 u$, assume $t\in a$ and $a\in u$ (1); ∧I gives $t\in a \wedge a\in u$; RAD (on $a\in u$) gives $\exists!a$; ∃I gives $\exists x(t\in x \wedge x\in u)$; $\in^2$E discharges (1). Right to left: given $\exists x(t\in x \wedge x\in u)$, assume $t\in a \wedge a\in u$ (2); ∧E gives $t\in a$ and $a\in u$; $\in^2$I gives $t\in^2 u$; ∃E discharges (2). □
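Lemma 11.7's equivalence is easy to see in a small nested finite model. The sketch below is ours and purely illustrative (the function name is our own):

```python
# t ∈² u  iff  ∃x (t ∈ x ∧ x ∈ u), over a small model of nested frozensets.
def member_of_member(t, u):
    return any(t in x for x in u if isinstance(x, frozenset))

a = frozenset({1, 2})
b = frozenset({3})
u = frozenset({a, b})

assert member_of_member(1, u)      # 1 ∈ a and a ∈ u
assert not member_of_member(4, u)  # 4 is in no member of u
```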

#### **11.7 Pasigraph ⋃ for unions**

We shall next choose the Axiom of Unions to illustrate further the inferentialist's method of treating *operator* pasigraphs. The pasigraph $\bigcup$, for the union of a given set, is *unary* (unlike the pasigraph ℙ for Pairing, which, as we have seen, is binary). Set-theorists handle the pasigraph $\bigcup$ with ease, as though it were a familiar primitive expression of their language.

The genuinely primitive form of the Axiom of Unions:

$$\forall \mathbf{x} \exists! \{ \mathbf{y} \mid \exists z (\mathbf{y} \in z \land z \in \mathbf{x}) \}$$

can be re-written

$$\forall x\,\exists!\bigcup x$$

provided only that $\bigcup$ is furnished with Introduction and Elimination rules so that

$$t = \bigcup v \ \dashv\vdash\ t = \{y \mid \exists z(y \in z \wedge z \in v)\}.$$

We choose to require the interdeducibility

$$t = \bigcup v \ \dashv\vdash\ t = \{y \mid \exists z(y \in z \wedge z \in v)\}$$

rather than require the provability of the identity

$$\bigcup v = \{y \mid \exists z(y \in z \wedge z \in v)\}$$

because we are working in a *free* logic. The latter identity commits one to the existence of the union of $v$:

$$\exists!\bigcup v.$$

The interdeducibility, however, does not. It allows one to pin down the meaning of $\bigcup$ as an operator on sets without committing one to its being everywhere (or indeed: *any*where) defined. One can grasp what $\bigcup$ means without yet adopting the Axiom of Unions. And when we *do* adopt that axiom, it does not serve implicitly to define the meaning of $\bigcup$. For that meaning will already have been defined by the following Introduction and Elimination rules for $\bigcup$.

$$\cup\text{-I}\quad \dfrac{\exists! t \qquad \exists! v \qquad \begin{matrix}\overline{a \in^2 v}^{\,(i)}\\ \vdots\\ a \in t\end{matrix} \qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ a \in^2 v\end{matrix}}{t = \bigcup v}\ (i)\,,\quad a \text{ parametric}$$

$$\cup\text{-E}_1\ \dfrac{t = \bigcup v \qquad u \in^2 v}{u \in t} \qquad \cup\text{-E}_2\ \dfrac{t = \bigcup v \qquad u \in t}{u \in^2 v} \qquad \cup\text{-E}_3\ \dfrac{t = \bigcup v}{\exists! t} \qquad \cup\text{-E}_4\ \dfrac{t = \bigcup v}{\exists! v}$$

**Lemma 11.8** *The operator pasigraph* $\bigcup$ *obeys the Rule of Functional Denotation; that is, the following is provable:*

$$\dfrac{\exists!\, \bigcup u}{\exists! u}.$$

*Proof*

$$\dfrac{\exists!\, \bigcup u,\ \text{i.e.,}\ \exists x\; x = \bigcup u \qquad \dfrac{\overline{a = \bigcup u}^{\,(1)}}{\exists! u}\ \cup\text{-E}_4}{\exists! u}\ (1)$$

**Lemma 11.9** $t = \bigcup v \vdash t = \{ y \mid \exists z\, (y \in z \land z \in v) \}$*.*

*Proof*

Let $\Sigma_1$ be the derivation of $\exists z\,(a \in z \land z \in v)$ from $a \in t$:

$$\dfrac{\dfrac{t = \bigcup v \qquad \overline{a \in t}^{\,(1)}}{a \in^2 v}\ \cup\text{-E}_2 \qquad \dfrac{\dfrac{\overline{b \in v}^{\,(2)}}{\exists! b} \qquad \dfrac{\overline{a \in b}^{\,(2)} \quad \overline{b \in v}^{\,(2)}}{a \in b \land b \in v}}{\exists z\, (a \in z \land z \in v)}}{\exists z\, (a \in z \land z \in v)}\ \in^2\text{-E}\,(2)$$

and $\Sigma_2$ the converse derivation:

$$\dfrac{\overline{\exists z\, (a \in z \land z \in v)}^{\,(3)} \qquad \dfrac{t = \bigcup v \qquad \dfrac{\dfrac{\overline{a \in c \land c \in v}^{\,(4)}}{a \in c} \qquad \dfrac{\overline{a \in c \land c \in v}^{\,(4)}}{c \in v}}{a \in^2 v}\ \in^2\text{-I}}{a \in t}\ \cup\text{-E}_1}{a \in t}\ (4)$$

Then conclude, by the introduction rule for class abstracts,

$$\dfrac{\dfrac{t = \bigcup v}{\exists! t}\ \cup\text{-E}_3 \qquad \Sigma_1 \qquad \Sigma_2}{t = \{ y \mid \exists z\, (y \in z \land z \in v) \}}\ (1), (3)$$

**Lemma 11.10** $t = \{ y \mid \exists z\, (y \in z \land z \in v) \},\ \exists! v \vdash t = \bigcup v$*.*

*Proof* Abbreviate $\exists z\, (y \in z \land z \in v)$ as $\Phi y$, where convenient, to reduce sideways spread. Let $\Pi_1$ be the following fragment of the final proof that we shall construct. Note that the two discharge strokes labeled (2) are being put in place in advance of the eventual step labeled (2) (in the final proof) that will effect that discharge.

$$\dfrac{t = \{ y \mid \Phi y \} \qquad \dfrac{\overline{a \in^2 v}^{\,(2)} \qquad \dfrac{\dfrac{\overline{c \in v}^{\,(4)}}{\exists! c} \qquad \dfrac{\overline{a \in c}^{\,(4)} \quad \overline{c \in v}^{\,(4)}}{a \in c \land c \in v}}{\exists z\, (a \in z \land z \in v)}}{\Phi a}\ \in^2\text{-E}\,(4) \qquad \dfrac{\overline{a \in^2 v}^{\,(2)}}{\exists! a}}{a \in t}$$

Now let $\Pi_2$ be the following fragment of the final proof that we shall construct. Note once again that the discharge stroke labeled (2) has already been put in place, in advance of the eventual step labeled (2) (in the final proof) that will effect that discharge.

$$\dfrac{\dfrac{t = \{ y \mid \Phi y \} \qquad \overline{a \in t}^{\,(2)}}{\Phi a} \qquad \dfrac{\dfrac{\overline{a \in b \land b \in v}^{\,(1)}}{a \in b} \qquad \dfrac{\overline{a \in b \land b \in v}^{\,(1)}}{b \in v}}{a \in^2 v}\ \in^2\text{-I}}{a \in^2 v}\ (1)$$

Now we can form the final proof

Frege's Class Theory and the Logic of Sets 121

$$\dfrac{\dfrac{t = \{ y \mid \Phi y \}}{\exists! t} \qquad \exists! v \qquad \Pi_1 \qquad \Pi_2}{t = \bigcup v}\ \cup\text{-I}\,(2)$$

#### **11.8 Pasigraph for inclusion, or subset**

The notion ⊆ of inclusion is one of the most familiar and frequently used binary, but *ancillary*, or *defined*, relations in set theory. The usual reading of $t \subseteq u$ is '$t$ is a subset of $u$'. The usual definition in primitive vocabulary would be

$$
t \subseteq u \equiv_{df} \forall x\, (x \in t \to x \in u)\,.
$$

The inferentialist, however, working in free logic, lays down instead the following introduction and elimination rules for this pasigraph:

$$\subseteq\text{-I}\quad \dfrac{\exists! t \qquad \exists! u \qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ a \in u\end{matrix}}{t \subseteq u}\ (i) \qquad \subseteq\text{-E}_1\ \dfrac{t \subseteq u}{\exists! t} \qquad \subseteq\text{-E}_2\ \dfrac{t \subseteq u}{\exists! u} \qquad \subseteq\text{-E}_3\ \dfrac{t \subseteq u \qquad v \in t}{v \in u}$$

That the predicate pasigraph ⊆ obeys the Rule of Atomic Denotation is obvious from ⊆-E₁ and ⊆-E₂.

#### **11.9 The unary predicate pasigraph 'trans'**

A *transitive* set is one that contains as members all members of its members. That is, every member of a transitive set is a subset of it. Thus we have the following introduction rule:

$$\mathsf{trans}\text{-I}\quad \dfrac{\exists! t \qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ a \subseteq t\end{matrix}}{\mathsf{trans}(t)}\ (i)\,,\quad a \text{ parametric}$$

matched by these elimination rules:

$$\mathsf{trans}\text{-E}\quad \dfrac{\mathsf{trans}(t)}{\exists! t} \qquad \dfrac{\mathsf{trans}(t) \qquad u \in t}{u \subseteq t}$$
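The defining condition — every member of a transitive set is a subset of it — is directly checkable on hereditarily finite sets. A sketch (the name `is_transitive` is ours):

```python
def is_transitive(t):
    """trans(t): every member u of t satisfies u ⊆ t."""
    return all(u <= t for u in t)  # frozenset's <= is the subset relation

empty = frozenset()
one = frozenset({empty})           # {∅}
two = frozenset({empty, one})      # {∅, {∅}}

assert is_transitive(two)                    # both ∅ and {∅} are subsets of two
assert not is_transitive(frozenset({one}))   # {{∅}}: its member {∅} is not a subset
```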

#### **11.10 Pasigraph for power sets**

We can now turn our attention to the Axiom of Power Sets to illustrate further the inferentialist's method for set theory.

We shall apply our earlier method to the unary operator pasigraph $\mathcal{P}$, which set-theorists handle with ease, as though it were a familiar primitive expression of their language. The genuinely primitive form of the Axiom of Power Sets:

$$\forall x\, \exists!\, \{ y \mid \forall z\, (z \in y \to z \in x) \}$$

i.e.,

$$\forall x\, \exists!\, \{ y \mid y \subseteq x \}$$

can be re-written

$$\forall x\, \exists!\, \mathcal{P}x$$

provided only that $\mathcal{P}$ is furnished with Introduction and Elimination rules so that

$$t = \mathcal{P}v \dashv\vdash t = \{ y \mid \forall z\, (z \in y \to z \in v) \}.$$

As with Unions, we choose with Power Sets to require the 'general term $t$'-involving interdeducibility

$$t = \mathcal{P}v \dashv\vdash t = \{ y \mid \forall z\, (z \in y \to z \in v) \}$$

rather than require the provability of the identity

$$\mathcal{P}v = \{ y \mid \forall z\, (z \in y \to z \in v) \},$$

because we are working in a *free* logic. The latter identity commits one to the existence of the power set of $v$:

$$\exists!\, \mathcal{P}v.$$

The interdeducibility, however, does not. It allows one to pin down the meaning of $\mathcal{P}$ as an operator on sets without committing one to its being everywhere (or indeed: *any*where) defined. One can grasp what $\mathcal{P}$ means without yet adopting the Axiom of Power Sets. And when we *do* adopt that axiom, it does not serve implicitly to define the meaning of $\mathcal{P}$. For that meaning will already have been defined by the following Introduction and Elimination rules for $\mathcal{P}$.
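The operational content of $\mathcal{P}$ — collecting exactly the subsets — can likewise be modeled on finite sets. A sketch (the helper `power_set` is ours):

```python
from itertools import chain, combinations

def power_set(v):
    """P(v) = {y | y ⊆ v}: all subsets of a finite set v."""
    elems = list(v)
    subsets = chain.from_iterable(
        combinations(elems, r) for r in range(len(elems) + 1))
    return frozenset(frozenset(s) for s in subsets)

v = frozenset({1, 2})
assert power_set(v) == frozenset({frozenset(), frozenset({1}),
                                  frozenset({2}), frozenset({1, 2})})
assert all(s <= v for s in power_set(v))  # every member is a subset of v
```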



$$\mathcal{P}\text{-I}\quad \dfrac{\exists! t \qquad \exists! v \qquad \begin{matrix}\overline{a \subseteq v}^{\,(i)}\\ \vdots\\ a \in t\end{matrix} \qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ a \subseteq v\end{matrix}}{t = \mathcal{P}v}\ (i)$$

$$\mathcal{P}\text{-E}_1\ \dfrac{t = \mathcal{P}v}{\exists! t} \qquad \mathcal{P}\text{-E}_2\ \dfrac{t = \mathcal{P}v}{\exists! v} \qquad \mathcal{P}\text{-E}_3\ \dfrac{t = \mathcal{P}v \qquad u \subseteq v}{u \in t} \qquad \mathcal{P}\text{-E}_4\ \dfrac{t = \mathcal{P}v \qquad u \in t}{u \subseteq v}$$

**Lemma 11.11** $\exists!\, \mathcal{P}v \vdash \exists! v$*.*

*Proof*

$$\dfrac{\exists!\, \mathcal{P}v,\ \text{i.e.,}\ \exists x\; x = \mathcal{P}v \qquad \dfrac{\overline{a = \mathcal{P}v}^{\,(1)}}{\exists! v}\ \mathcal{P}\text{-E}_2}{\exists! v}\ (1)$$

$$\text{Lemma 11.12}\quad \dfrac{\exists!\, \mathcal{P}v \qquad u \subseteq v}{u \in \mathcal{P}v}.$$

*Proof*

$$\dfrac{\exists!\, \mathcal{P}v,\ \text{i.e.,}\ \exists x\; x = \mathcal{P}v \qquad \dfrac{\overline{a = \mathcal{P}v}^{\,(1)} \qquad \dfrac{\overline{a = \mathcal{P}v}^{\,(1)} \quad u \subseteq v}{u \in a}\ \mathcal{P}\text{-E}_3}{u \in \mathcal{P}v}}{u \in \mathcal{P}v}\ (1)$$

$$\text{Lemma 11.13}\quad \dfrac{u \in \mathcal{P}v}{u \subseteq v}$$

*Proof*

$$\dfrac{\dfrac{u \in \mathcal{P}v}{\exists!\, \mathcal{P}v,\ \text{i.e.,}\ \exists x\; x = \mathcal{P}v} \qquad \dfrac{\overline{a = \mathcal{P}v}^{\,(1)} \qquad \dfrac{\overline{a = \mathcal{P}v}^{\,(1)} \quad u \in \mathcal{P}v}{u \in a}}{u \subseteq v}\ \mathcal{P}\text{-E}_4}{u \subseteq v}\ (1)$$

**Lemma 11.14** $t = \mathcal{P}v \vdash t = \{ y \mid \forall z\, (z \in y \to z \in v) \}$*.*

*Proof*

Let $\Sigma_1$ be the derivation of $\forall z\,(z \in a \to z \in v)$ from $a \in t$:

$$\dfrac{\dfrac{\dfrac{t = \mathcal{P}v \qquad \overline{a \in t}^{\,(1)}}{a \subseteq v}\ \mathcal{P}\text{-E}_4 \qquad \overline{c \in a}^{\,(2)}}{\dfrac{c \in v}{c \in a \to c \in v}\ (2)}\ \subseteq\text{-E}_3}{\forall z\, (z \in a \to z \in v)}\,,\quad c \text{ parametric}$$

and $\Sigma_2$ the derivation of $a \in t$ from $\forall z\,(z \in a \to z \in v)$ together with $\exists! a$:

$$\dfrac{t = \mathcal{P}v \qquad \dfrac{\overline{\exists! a}^{\,(1)} \qquad \dfrac{t = \mathcal{P}v}{\exists! v}\ \mathcal{P}\text{-E}_2 \qquad \dfrac{\overline{\forall z\, (z \in a \to z \in v)}^{\,(1)} \qquad \overline{b \in a}^{\,(2)}}{b \in v}}{a \subseteq v}\ \subseteq\text{-I}\,(2)}{a \in t}\ \mathcal{P}\text{-E}_3$$

Then conclude

$$\dfrac{\dfrac{t = \mathcal{P}v}{\exists! t}\ \mathcal{P}\text{-E}_1 \qquad \Sigma_1 \qquad \Sigma_2}{t = \{ y \mid \forall z\, (z \in y \to z \in v) \}}\ (1)$$

**Lemma 11.15** $\exists! v,\ t = \{ y \mid \forall z\, (z \in y \to z \in v) \} \vdash t = \mathcal{P}v$*.*

*Proof* Abbreviate $\forall z\,(z \in y \to z \in v)$ by $\Phi y$ and $\forall z\,(z \in a \to z \in v)$ by $\Phi a$, where convenient, to reduce sideways spread. Moreover, let $\Pi$ be

$$\dfrac{\dfrac{\overline{a \in t}^{\,(1)}}{\exists! a} \qquad \exists! v \qquad \dfrac{\dfrac{t = \{ y \mid \Phi y \} \qquad \overline{a \in t}^{\,(1)}}{\Phi a} \qquad \overline{b \in a}^{\,(2)}}{b \in v}}{a \subseteq v}\ \subseteq\text{-I}\,(2)$$

Then form the proof

$$\dfrac{\dfrac{t = \{ y \mid \Phi y \}}{\exists! t} \qquad \exists! v \qquad \dfrac{t = \{ y \mid \Phi y \} \qquad \dfrac{\dfrac{\overline{a \subseteq v}^{\,(1)} \qquad \overline{b \in a}^{\,(2)}}{\dfrac{b \in v}{b \in a \to b \in v}\ (2)}}{\Phi a} \qquad \dfrac{\overline{a \subseteq v}^{\,(1)}}{\exists! a}}{a \in t} \qquad \Pi}{t = \mathcal{P}v}\ \mathcal{P}\text{-I}\,(1)$$

**Lemma 11.16** $\exists!\, \mathcal{P}t,\ \mathsf{trans}(t) \vdash \mathsf{trans}(\mathcal{P}t)$*.*

*Proof*

$$\dfrac{\exists!\, \mathcal{P}t \qquad \dfrac{\dfrac{\overline{c \in \mathcal{P}t}^{\,(1)}}{\exists! c} \qquad \exists!\, \mathcal{P}t \qquad \dfrac{\exists!\, \mathcal{P}t \qquad \dfrac{\mathsf{trans}(t) \qquad \dfrac{\dfrac{\overline{c \in \mathcal{P}t}^{\,(1)}}{c \subseteq t}\ \text{(Lemma 11.13)} \qquad \overline{d \in c}^{\,(2)}}{d \in t}}{d \subseteq t}}{d \in \mathcal{P}t}\ \text{(Lemma 11.12)}}{c \subseteq \mathcal{P}t}\ \subseteq\text{-I}\,(2)}{\mathsf{trans}(\mathcal{P}t)}\ \mathsf{trans}\text{-I}\,(1)$$

#### **11.11 The binary predicate pasigraph for 'is disjoint from'**

We now introduce a new binary-relation pasigraph: $\mathrm{disj}(t, u)$ is to mean that $t$ is disjoint from $u$ — that is, they have no member in common. This pasigraph will be useful in the formulation of the Axiom (or Rule) of Regularity. The introduction and elimination rules for $\mathrm{disj}$ are as follows.

$$\mathrm{disj}\text{-I}\quad \dfrac{\exists! t \qquad \exists! u \qquad \begin{matrix}\overline{a \in t}^{\,(i)} \quad \overline{a \in u}^{\,(i)}\\ \vdots\\ \bot\end{matrix}}{\mathrm{disj}(t, u)}\ (i) \qquad \mathrm{disj}\text{-E}\quad \dfrac{\mathrm{disj}(t, u) \qquad v \in t \qquad v \in u}{\bot}$$

**Lemma 11.17** $\mathrm{disj}(t, u) \dashv\vdash t \cap u = \emptyset$*.*
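Lemma 11.17's equivalence between disjointness and empty intersection is immediate on finite sets. A sketch (the name `disjoint` is ours, standing in for the chapter's pasigraph):

```python
def disjoint(t, u):
    """t is disjoint from u: they have no member in common."""
    return not (t & u)  # & is frozenset intersection

t, u = frozenset({1, 2}), frozenset({3})
assert disjoint(t, u)
assert not disjoint(t, frozenset({2, 3}))

# Lemma 11.17: disjointness coincides with having empty intersection
for x, y in [(t, u), (t, frozenset({2, 3}))]:
    assert disjoint(x, y) == ((x & y) == frozenset())
```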

#### **11.12 Pasigraph for ranges**

Suppose $\varphi$ has the variables $x$, $y$ free, and is functional from $x$ to $y$, at least for $x$ in $t$. The Replacement Pasigraph $\varphi_{xy}[t]$ can then be explicitly defined as follows.


$$
\varphi_{xy}[t] =_{df} \{ y \mid \exists x\, (x \in t \land \varphi) \}.
$$

The subscripting with $x$ and $y$ registers the fact that this pasigraph binds those two variables.

[The Axiom Scheme of] Replacement, due to Fraenkel, can be formulated as the following 'conditional existence' rule (where '$\exists_1$' is the uniqueness quantifier):

$$\dfrac{\exists! t \qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ \exists_1 y\, \varphi a y\end{matrix}}{\exists!\, \varphi_{xy}[t]}\ (i)$$

The pasigraph $\varphi_{xy}[t]$ could also be taken as a grammatical primitive, furnished with the following rules of introduction and elimination.

$$\varphi_{xy}[\ldots]\text{-Intro}\quad \dfrac{\exists! t \qquad \begin{matrix}\overline{b \in s}^{\,(i)}\\ \vdots\\ a \in t\end{matrix} \qquad \begin{matrix}\overline{b \in s}^{\,(i)}\\ \vdots\\ \varphi a b\end{matrix} \qquad \begin{matrix}\overline{a \in t}^{\,(i)} \quad \overline{\varphi a b}^{\,(i)}\\ \vdots\\ b \in s\end{matrix} \qquad \begin{matrix}\overline{a \in t}^{\,(i)} \quad \overline{\varphi a b}^{\,(i)} \quad \overline{\varphi a c}^{\,(i)}\\ \vdots\\ b = c\end{matrix}}{s = \varphi_{xy}[t]}\ (i)$$

Note that the first two subproofs call for some same term $a$ in their conclusions. This means that the elimination rule will have a part that corresponds to these two subproofs taken together. The final subproof ensures the functionality of $\varphi$ on $t$ as its domain.

$$\varphi_{xy}[\ldots]\text{-Elim}\quad \dfrac{v = \varphi_{xy}[t] \qquad u \in v \qquad \begin{matrix}\overline{a \in t}^{\,(i)} \quad \overline{\varphi a u}^{\,(i)}\\ \vdots\\ \theta\end{matrix}}{\theta}\ (i) \qquad \dfrac{v = \varphi_{xy}[t] \qquad u \in t \qquad \varphi u w}{w \in v}$$
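When $\varphi$ is functional on $t$, the range pasigraph is just the image of $t$ under that function. A sketch (treating the functional relation $\varphi xy$ as a Python function; names ours):

```python
def replacement(phi, t):
    """φ_xy[t] = {y | ∃x (x ∈ t ∧ φxy)}, with the functional φ given as x ↦ y."""
    return frozenset(phi(x) for x in t)

t = frozenset({1, 2, 3})
assert replacement(lambda x: x * x, t) == frozenset({1, 4, 9})
assert replacement(lambda x: 0, t) == frozenset({0})  # constant maps collapse the range
```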

In set theory, the successor of a set $u$ is defined as $u \cup \{u\}$ — or, in the notation we have thus far introduced, as $\bigcup \mathbb{P}(u, \mathbb{P}(u, u))$. We shall now introduce a unary operator pasigraph s to represent successor, and furnish it with introduction and elimination rules that secure for it the same meaning.

$$\mathrm{s}\text{-I}\quad \dfrac{u \in t \qquad \begin{matrix}\overline{a \in u}^{\,(i)}\\ \vdots\\ a \in t\end{matrix} \qquad \begin{matrix}\overline{b \in t}^{\,(i)}\\ \vdots\\ b \in u \lor b = u\end{matrix}}{t = \mathrm{s}u}\ (i)$$

$$\mathrm{s}\text{-E}_1\ \dfrac{t = \mathrm{s}u}{u \in t} \qquad \mathrm{s}\text{-E}_2\ \dfrac{t = \mathrm{s}u \qquad v \in u}{v \in t} \qquad \mathrm{s}\text{-E}_3\ \dfrac{t = \mathrm{s}u \qquad v \in t}{v \in u \lor v = u}$$

#### **Lemma 11.18** $t = u \cup \{u\} \dashv\vdash t = \bigcup \mathbb{P}(u, \mathbb{P}(u, u))$*.*
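The successor operation is easy to model directly. A sketch (the name `successor` is ours) that also checks the membership condition secured by the s-rules:

```python
def successor(u):
    """s u = u ∪ {u}: the von Neumann successor."""
    return u | frozenset({u})

empty = frozenset()
one = successor(empty)   # {∅}
two = successor(one)     # {∅, {∅}}

assert one == frozenset({empty})
assert two == frozenset({empty, one})
assert empty in two and one in two  # b ∈ su iff b ∈ u or b = u
```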

#### **11.13 The binary predicate pasigraph 'comp'**

comp$(t, u)$ is to mean that $t$ and $u$ are comparable in terms of the membership relation:

$$\mathsf{comp}(t,\mathsf{u}) \equiv\_{df} t \in \mathsf{u} \lor t = \mathsf{u} \lor \mathsf{u} \in \mathsf{t}.$$

Introduction and elimination rules that secure this meaning directly are as follows.

$$\mathsf{comp}\text{-I}\quad \dfrac{t \in u}{\mathsf{comp}(t, u)} \qquad \dfrac{t = u}{\mathsf{comp}(t, u)} \qquad \dfrac{u \in t}{\mathsf{comp}(t, u)}$$

$$\mathsf{comp}\text{-E}\quad \dfrac{\mathsf{comp}(t, u) \qquad \begin{matrix}\overline{t \in u}^{\,(i)}\\ \vdots\\ \theta/\bot\end{matrix} \qquad \begin{matrix}\overline{t = u}^{\,(i)}\\ \vdots\\ \theta/\bot\end{matrix} \qquad \begin{matrix}\overline{u \in t}^{\,(i)}\\ \vdots\\ \theta/\bot\end{matrix}}{\theta/\bot}\ (i)$$

#### **11.14 The unary predicate pasigraph 'conn'**

conn$(t)$ is to mean that $t$ is connected by the membership relation:

$$\mathsf{conn}(t) \equiv_{df} \forall x \forall y\, ((x \in t \land y \in t) \to (x \in y \lor x = y \lor y \in x))\,.$$

Introduction and elimination rules that secure this meaning directly are as follows.

$$\mathsf{conn}\text{-I}\quad \dfrac{\exists! t \qquad \begin{matrix}\overline{a \in t}^{\,(i)} \quad \overline{b \in t}^{\,(i)}\\ \vdots\\ \mathsf{comp}(a, b)\end{matrix}}{\mathsf{conn}(t)}\ (i)\,,\quad a, b \text{ parametric} \qquad \mathsf{conn}\text{-E}\quad \dfrac{\mathsf{conn}(t)}{\exists! t} \qquad \dfrac{\mathsf{conn}(t) \qquad u \in t \qquad v \in t}{\mathsf{comp}(u, v)}$$
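Both predicates can be checked directly on hereditarily finite sets. A sketch (function names ours):

```python
def comp(t, u):
    """comp(t, u): t ∈ u or t = u or u ∈ t."""
    return t in u or t == u or u in t

def conn(t):
    """conn(t): any two members of t are ∈-comparable."""
    return all(comp(a, b) for a in t for b in t)

empty = frozenset()
one = frozenset({empty})
assert conn(frozenset({empty, one}))  # ∅ ∈ {∅}, so {∅, {∅}} is connected
assert not conn(frozenset({frozenset({1}), frozenset({2})}))  # {1}, {2} incomparable
```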

#### **11.15 The unary predicate pasigraph 'O' (for 'is an ordinal')**

O$(t)$ is to mean that $t$ is an ordinal. The introduction and elimination rules are as follows.


$$\mathrm{O}\text{-I}\quad \dfrac{\mathsf{trans}(t) \qquad \mathsf{conn}(t)}{\mathrm{O}(t)} \qquad \mathrm{O}\text{-E}\quad \dfrac{\mathrm{O}(t)}{\mathsf{trans}(t)} \qquad \dfrac{\mathrm{O}(t)}{\mathsf{conn}(t)}$$
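For hereditarily finite sets, the two conditions of O-I can be tested together, and the von Neumann ordinals pass. A self-contained sketch (names ours):

```python
def is_transitive(t):
    return all(u <= t for u in t)

def conn(t):
    return all(a in b or a == b or b in a for a in t for b in t)

def is_ordinal(t):
    """O(t): t is transitive and connected by membership."""
    return is_transitive(t) and conn(t)

# The von Neumann ordinals 0, 1, 2, 3 all qualify
n, ordinals = frozenset(), []
for _ in range(4):
    ordinals.append(n)
    n = n | frozenset({n})
assert all(is_ordinal(o) for o in ordinals)
assert not is_ordinal(frozenset({frozenset({frozenset()})}))  # {{∅}} is not transitive
```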

#### **11.16 The unary predicate pasigraph 'IoS' (for 'is an initial or successor ordinal')**

IoS$(t)$ is to mean that $t$ is an *initial or successor* ordinal. The introduction and elimination rules are as follows.

$$\mathrm{IoS}\text{-I}\quad \dfrac{\mathrm{O}(t) \qquad t = \emptyset \lor \exists x\; t = \mathrm{s}x}{\mathrm{IoS}(t)} \qquad \mathrm{IoS}\text{-E}\quad \dfrac{\mathrm{IoS}(t)}{\mathrm{O}(t)} \qquad \dfrac{\mathrm{IoS}(t)}{t = \emptyset \lor \exists x\; t = \mathrm{s}x}$$

#### **11.17 The unary predicate pasigraph 'fO' (for 'is a finite ordinal')**

fO$(t)$ is to mean that $t$ is a *finite* ordinal. The introduction and elimination rules are as follows.

$$\mathrm{fO}\text{-I}\quad \dfrac{\mathrm{IoS}(t) \qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ \mathrm{IoS}(a)\end{matrix}}{\mathrm{fO}(t)}\ (i) \qquad \mathrm{fO}\text{-E}\quad \dfrac{\mathrm{fO}(t)}{\mathrm{IoS}(t)} \qquad \dfrac{\mathrm{fO}(t) \qquad u \in t}{\mathrm{IoS}(u)}$$

#### **11.18 The constant pasigraph $\omega$**

The set of finite ordinals:

$$
\omega =_{df} \{ x \mid \mathsf{fO}(x) \}
$$

is the canonical choice among set theorists of a *countably infinite* set. The Axiom of Infinity is usually formulated as the statement that $\omega$ exists:

$$\exists!\, \omega.$$

**Theorem 11.19** $\exists! t \vdash t = \bigcup \mathcal{P}t$*.*

*Proof* The following proof uses the new rules for $\bigcup$, ⊆, ℙ, and $\mathcal{P}$. It also appeals to the existence of singletons, and the existence of power sets.

Writing $\{a\}$ for $\mathbb{P}(a, a)$, let $\Sigma_1$ be the derivation of $a \in^2 \mathcal{P}t$ from $a \in t$:

$$\dfrac{\dfrac{\dfrac{\overline{a \in t}^{\,(1)}}{\exists! a}}{\dfrac{\exists!\, \{a\}}{a \in \{a\}}} \qquad \dfrac{\dfrac{\overline{a \in t}^{\,(1)}}{\exists!\, \{a\}} \qquad \exists! t \qquad \dfrac{\dfrac{\overline{b \in \{a\}}^{\,(2)}}{b = a} \qquad \overline{a \in t}^{\,(1)}}{b \in t}}{\dfrac{\{a\} \subseteq t}{\{a\} \in \mathcal{P}t}\ \text{(Lemma 11.12)}}\ \subseteq\text{-I}\,(2)}{a \in^2 \mathcal{P}t}\ \in^2\text{-I}$$

and let $\Sigma_2$ be the converse derivation:

$$\dfrac{\overline{a \in^2 \mathcal{P}t}^{\,(1)} \qquad \dfrac{\dfrac{\overline{c \in \mathcal{P}t}^{\,(2)}}{c \subseteq t}\ \text{(Lemma 11.13)} \qquad \overline{a \in c}^{\,(2)}}{a \in t}\ \subseteq\text{-E}_3}{a \in t}\ \in^2\text{-E}\,(2)$$

Then $\cup$-I yields

$$\dfrac{\exists! t \qquad \dfrac{\exists! t}{\exists!\, \mathcal{P}t} \qquad \Sigma_1 \qquad \Sigma_2}{t = \bigcup \mathcal{P}t}\ \cup\text{-I}\,(1)$$

We do *not* have the operator-commutation of Theorem 11.19, namely $\exists! t \vdash t = \mathcal{P}\bigcup t$. Here is a counterexample: *Take* $t = \{\{u\}\}$*, where* $\exists! u$*, whence* $\exists! t$*. Then* $\bigcup t = \{u\}$*. So* $\mathcal{P}\bigcup t = \{\emptyset, \{u\}\} \neq \{\{u\}\} = t$*.*
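A finite check of Theorem 11.19, and of the failure of the commuted identity, can be run in Python (helper names ours):

```python
from itertools import chain, combinations

def power_set(v):
    elems = list(v)
    return frozenset(frozenset(s) for s in chain.from_iterable(
        combinations(elems, r) for r in range(len(elems) + 1)))

def big_union(v):
    return frozenset(y for z in v for y in z)

# Theorem 11.19: t = ∪Pt, for any (finite) set t
for t in [frozenset(), frozenset({1, 2}), frozenset({frozenset()})]:
    assert big_union(power_set(t)) == t

# The commutation fails: with u = ∅ and t = {{u}},
# ∪t = {u}, but P∪t = {∅, {u}} ≠ t
u = frozenset()
t = frozenset({frozenset({u})})
assert big_union(t) == frozenset({u})
assert power_set(big_union(t)) != t
```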

#### **12 New rules for identity and existence in free logic**

In our deployment of free first-order logic thus far, we have used the abbreviation $\exists! t$ to express the thought that $t$ exists, or is defined. Moreover, $\exists! t$ has been taken as a mere abbreviation of the longer, 'official' sentence $\exists x\; x = t$.

We shall now provide new introduction and elimination rules for the *separate* expressions !, =, ∃ and ∀, which do a better job of capturing their logical roles.

The formal sentence $!t$ will now, officially, be a well-formed sentence produced by our logical grammar. $!t$ will *take over* the role formerly played by $\exists! t$. Since ! will be treated as a *primitive* expression, $!t$ will *not* be a definitional abbreviation.

$!t$ means "$t$ exists", or, as mathematicians often put it, "$t$ is defined".

#### **12.1 Introduction and elimination rules for !**

! has two parts to its introduction rule:

$$\dfrac{A(\ldots t \ldots)}{!t}\,,\ \text{where } A \text{ is an atomic predicate;} \qquad \dfrac{!f(\ldots t \ldots)}{!t}\,.$$

So: $!t$ is a consequence of any atomic fact involving (the denotation of) $t$; and is a consequence also of the existence of any function's yielding a value on arguments among which is (the denotation of) $t$.

When a proposition can be inferred from each of such a wide range of propositions, it must be extremely weak; and its own consequences will be at least as weak.

So when the question arises: *What might legitimately be inferred from* $!t$*?*, given its own two-part introduction rule, the answer must be: an atomic proposition, involving $t$ as its only constituent term, that is bound to be true no matter what 'positive' atomic facts might obtain (involving the denotation of $t$), and no matter what mappings might be effected by what functions involving the denotation of $t$ as an input. An excellent candidate for such an atomic proposition would be $t = t$. The Elimination Rule for ! is, accordingly:

$$!\text{E}\quad \dfrac{!t}{t = t}\,.$$

#### **12.2 Introduction and elimination rules for** =

$$=\text{I}\quad \dfrac{\begin{matrix}\overline{\Phi(t)}^{\,(i)}\\ \vdots\\ \Phi(u)\end{matrix} \qquad \begin{matrix}\overline{\Phi(u)}^{\,(i)}\\ \vdots\\ \Phi(t)\end{matrix} \qquad !t \qquad !u}{t = u}\ (i)\,,\ \text{where } \Phi \text{ is parametric;} \qquad =\text{E}\quad \dfrac{t = u \qquad \Phi(t)}{\Phi(u)} \qquad \dfrac{t = u}{!t} \qquad \dfrac{t = u}{!u}\,.$$

Note how the last two parts of =E are already covered by the rule !I — since identity statements are atomic.

#### **Theorem 12.1** $!t \vdash t = t$*.*

$$\textit{Proof}\qquad \dfrac{\overline{\Phi(t)}^{\,(i)} \qquad \overline{\Phi(t)}^{\,(i)} \qquad !t \qquad !t}{t = t}\ =\text{I}\,(i) \qquad \Box$$

It is interesting that this derived result using =I simply *is* the rule !E.

$$\dfrac{\dfrac{A(t)}{!t}\ !\text{I}}{t = t}\ !\text{E} \qquad\qquad \dfrac{\dfrac{!f(t)}{!t}\ !\text{I}}{t = t}\ !\text{E}$$

We see, then, that the two inferences

$$\frac{A(t)}{t=t} \quad \text{and} \quad \frac{!f(t)}{t=t}$$

have normal proofs. We shall henceforth adopt them as primitive inferences, while mindful that they are actually derived rules.

With $!t$ as our preferred way of expressing the existence or definedness of $t$, our earlier proof of the sequent $\exists! t : t = \bigcup \mathcal{P}t$ can be rewritten as follows:

The subderivations $\Sigma_1$ (from $a \in t$ to $a \in^2 \mathcal{P}t$) and $\Sigma_2$ (from $a \in^2 \mathcal{P}t$ to $a \in t$) are as in the proof of Theorem 11.19, with $!$ in place of $\exists!$; the proof concludes

$$\dfrac{!t \qquad \dfrac{!t}{!\,\mathcal{P}t} \qquad \Sigma_1 \qquad \Sigma_2}{t = \bigcup \mathcal{P}t}\ \cup\text{-I}\,(1)$$

#### **12.3 Some results for !,** =**,** ∃**, and** ∀


**Theorem 12.2** $!t \vdash \exists x\; x = t$*.*

*Proof*
$$\dfrac{!t \qquad \dfrac{!t}{t = t}\ !\text{E}}{\exists x\; x = t}\ \exists\text{I} \qquad \Box$$

**Theorem 12.3** $\exists x\; x = t \vdash\ !t$*.*

*Proof*
$$\dfrac{\exists x\; x = t \qquad \dfrac{\overline{a = t}^{\,(1)}}{!t}\ !\text{I}}{!t}\ \exists\text{E}\,(1) \qquad \Box$$

**Theorem 12.4** $\vdash \forall x\, !x$*.*

*Proof*
$$\dfrac{\overline{!a}^{\,(1)}}{\forall x\, !x}\ \forall\text{I}\,(1) \qquad \Box$$

**Theorem 12.5** $\vdash \forall x\; x = x$*.*

*Proof*
$$\dfrac{\dfrac{\overline{!a}^{\,(1)}}{a = a}\ !\text{E}}{\forall x\; x = x}\ \forall\text{I}\,(1) \qquad \Box$$

#### **12.4 Using the new pasigraph ! to reformulate the introduction and elimination rules for set-theoretic pasigraphs**

**The rules for set-theoretical pasigraphs, rewritten:** ∅**.**

$$\emptyset\text{-I}\quad \dfrac{!t \qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ \bot\end{matrix}}{t = \emptyset}\ (i) \qquad \emptyset\text{-E}_1\ \dfrac{t = \emptyset}{!t} \qquad \emptyset\text{-E}_2\ \dfrac{t = \emptyset \qquad u \in t}{\bot}$$

**The rules for set-theoretical pasigraphs, rewritten:** ⊆**.**

$$\subseteq\text{-I}\quad \dfrac{!t \qquad !u \qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ a \in u\end{matrix}}{t \subseteq u}\ (i) \qquad \subseteq\text{-E}_1\ \dfrac{t \subseteq u}{!t} \qquad \subseteq\text{-E}_2\ \dfrac{t \subseteq u}{!u} \qquad \subseteq\text{-E}_3\ \dfrac{t \subseteq u \qquad v \in t}{v \in u}$$

**The rules for set-theoretical pasigraphs, rewritten:** ℙ**.** These rules are unchanged.

$$\mathbb{P}\text{-I}\quad \dfrac{u \in t \qquad v \in t \qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ a = u \lor a = v\end{matrix}}{t = \mathbb{P}(u, v)}\ (i)$$


$$\mathbb{P}\text{-E}_1\ \dfrac{t = \mathbb{P}(u, v)}{u \in t} \qquad \mathbb{P}\text{-E}_2\ \dfrac{t = \mathbb{P}(u, v)}{v \in t} \qquad \mathbb{P}\text{-E}_3\ \dfrac{t = \mathbb{P}(u, v) \qquad w \in t}{w = u \lor w = v}$$

**The rules for set-theoretical pasigraphs, rewritten: s.** These rules are unchanged.

$$\mathrm{s}\text{-I}\quad \dfrac{u \in t \qquad \begin{matrix}\overline{a \in u}^{\,(i)}\\ \vdots\\ a \in t\end{matrix} \qquad \begin{matrix}\overline{b \in t}^{\,(i)}\\ \vdots\\ b \in u \lor b = u\end{matrix}}{t = \mathrm{s}u}\ (i) \qquad \mathrm{s}\text{-E}_1\ \dfrac{t = \mathrm{s}u}{u \in t} \qquad \mathrm{s}\text{-E}_2\ \dfrac{t = \mathrm{s}u \qquad w \in u}{w \in t} \qquad \mathrm{s}\text{-E}_3\ \dfrac{t = \mathrm{s}u \qquad w \in t}{w \in u \lor w = u}$$

**The rules for set-theoretical pasigraphs, rewritten:** $\bigcup$**.**

$$\cup\text{-I}\quad \dfrac{!t \qquad !v \qquad \begin{matrix}\overline{a \in^2 v}^{\,(i)}\\ \vdots\\ a \in t\end{matrix} \qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ a \in^2 v\end{matrix}}{t = \bigcup v}\ (i)\,,\quad a \text{ parametric}$$

$$\cup\text{-E}_1\ \dfrac{t = \bigcup v \qquad u \in^2 v}{u \in t} \qquad \cup\text{-E}_2\ \dfrac{t = \bigcup v \qquad u \in t}{u \in^2 v} \qquad \cup\text{-E}_3\ \dfrac{t = \bigcup v}{!t} \qquad \cup\text{-E}_4\ \dfrac{t = \bigcup v}{!v}$$

#### **The rules for set-theoretical pasigraphs, rewritten: Separation.**

$$\text{Intro}\quad \dfrac{!t \qquad !v \qquad \begin{matrix}\overline{a \in v}^{\,(i)} \quad \overline{\varphi^x_a}^{\,(i)}\\ \vdots\\ a \in t\end{matrix} \qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ a \in v\end{matrix} \qquad \begin{matrix}\overline{a \in t}^{\,(i)}\\ \vdots\\ \varphi^x_a\end{matrix}}{t = \{x \in v \mid \varphi\}}\ (i)$$

$$\text{Elim}\quad \dfrac{t = \{x \in v \mid \varphi\} \qquad \varphi^x_u \qquad u \in v}{u \in t} \qquad \dfrac{t = \{x \in v \mid \varphi\} \qquad u \in t}{\varphi^x_u} \qquad \dfrac{t = \{x \in v \mid \varphi\} \qquad u \in t}{u \in v} \qquad \dfrac{t = \{x \in v \mid \varphi\}}{!t}$$

 

**The rules for set-theoretical pasigraphs, rewritten:** 𝒫**.**


$$\mathcal{P}\text{-I}\quad \frac{!t \qquad !v \qquad \begin{array}{c}\overline{a\in t}\,^{(i)}\\ \vdots\\ a\subseteq v\end{array} \qquad \begin{array}{c}\overline{a\subseteq v}\,^{(i)}\\ \vdots\\ a\in t\end{array}}{t=\mathcal{P}v}\,(i)\quad (a\ \text{parametric})$$

132 Neil Tennant


$$\begin{array}{ccccc} \mathcal{P}\text{-}\mathrm{E}\_{1} & \dfrac{t = \mathcal{P}v}{!t} & & \mathcal{P}\text{-}\mathrm{E}\_{2} & \dfrac{t = \mathcal{P}v}{!v} \\\\ \mathcal{P}\text{-}\mathrm{E}\_{3} & \dfrac{t = \mathcal{P}v \quad \mu \subseteq v}{\mu \in t} & & \mathcal{P}\text{-}\mathrm{E}\_{4} & \dfrac{t = \mathcal{P}v \quad \mu \in t}{\mu \subseteq v} \\\\ \end{array}$$

**The rules for ontologically committal — and** *classical* **— set theory, rewritten.**

*Extensionality* This is now derivable, given the I- and E-rules for $\{x\in v\mid\varphi\}$.

*Existence of Empty Set*

$$!\,\emptyset$$

*Separation Schema*

$$\frac{!t}{!\{\mathrm{x}\in t \mid \varphi\}}$$

*Pairing*

$$\frac{!u \quad !v}{!\,\mathbb{P}(u,v)}$$

*Union*

$$\frac{!v}{!\bigcup v}$$

*Power Set*

$$\frac{!v}{!\,\mathcal{P}v}$$


*Regularity, or Foundation*

$$\frac{u\in t \qquad \begin{array}{c}\overline{a\in t}\,^{(i)}\ \ \overline{(\forall y{\in}a)\,y\notin t}\,^{(i)}\\ \vdots\\ \theta\end{array}}{\theta}\,(i)$$

*Replacement Schema*

$$\frac{!t \qquad \begin{array}{c}\overline{a\in t}\,^{(i)}\\ \vdots\\ \exists\_{1}y\,\varphi^{x}\_{a}\end{array}}{!\{y \mid (\exists x{\in}t)\,\varphi\}}\,(i)$$

*Infinity*

!

*Choice*

$$\frac{(\forall x\in u)(\exists y\in v)\,\varphi(x,y)}{(\exists f\colon u\mapsto v)(\forall x\in u)\,\varphi(x,f(x))}$$

## **13 Concluding remarks**

It is important to remind the reader of the methodological underpinnings of this study. We have sought to illuminate the meanings of set-theoretical expressions in such a way as to secure agreement on those meanings from classicists, intuitionists, and constructivists alike. This we have done by laying down ontologically non-committal rules for set-theoretical expressions, no matter whether they are, conventionally, either primitive or defined. This captures the 'analytical core' of set-theoretical talk — what Quine once called 'virtual set theory'. Interestingly, all the proofs thus far involved in delivering this analytical core are proofs in Core Logic.

It is then a *further* question what sets actually exist — either outright or conditionally. Two simple examples, respectively, will illustrate this. That the empty set exists is an *outright* existential assertion. That, given any two sets, their pair set exists, is a *conditional* existential assertion. On such simple, finitistic assertions it is no surprise

that no theorist from any of the competing camps — classical, intuitionistic, or constructivist — demurs. Disagreements arise only when one begins to deal with such matters as *completed infinities*; sets being specified by means of *effectively undecidable* formulae; sets being specified by means of *impredicative* formulae; and/or whether there can be sets answering to formulae whose extensions would be *too extensive*. Constructive set theorists also have to be vigilant about their choices of constructively distinguishable (i.e., non-equivalent) formulations of axioms or axiom-schemes that the classicist is able to regard as equivalent (possibly, *modulo* other, 'more basic', or 'secure' axioms already laid down). Such is the case with various possible forms of the Axiom of Choice; of the Axiom of Regularity, or Foundation; and (so this author contends) with the Axiom Scheme of Separation.15

In the conduct of the further investigations to which these latter considerations give rise, the present author offers the *common parlance* of the pasigraphs treated above. They provide the *lingua franca* within which classicists, intuitionists, and constructivists can subsequently disagree, or agree to differ, given their respective doctrinal grounds concerning the nature of mathematical existence, the bivalence of mathematical truth, whether such truth is epistemically constrained, etc.

#### **References**


<sup>15</sup> See Tennant (2020; 2021).


Open Access This chapter is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. If you remix, transform, or build upon this chapter or a part thereof, you must distribute your contributions under the same license as the original.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **The Validity of Inference and Argument**<sup>∗</sup>

Dag Prawitz

It has been common in contemporary logic and philosophy of logic to identify the validity of an inference with its conclusion being a (logical) consequence of its premisses. This identification pays attention to at most a necessary condition for an inference being acceptable in a deductive argument or proof. An inference is not acceptable unless the conclusion becomes evident because of being supported by the premisses. Can we define this condition in a stringent way so that we get a concept of valid inference allowing us to characterize a proof or valid deductive argument as a chain of valid inferences? This is the main question that I shall be concerned with in this essay. By the validity of an inference I understand henceforth a concept of that kind, which I shall strive to explicate here.

### **1 The concept of inference**

Before entering the main discussion of what should be required of a valid inference, we should pay attention to the concept of inference. It is reasonable to think that the validity of an inference should be connected with the activity of making inferences

Dag Prawitz

University of Stockholm, Sweden, e-mail: dag.prawitz@philosophy.su.se

<sup>∗</sup> It is a great pleasure to me to submit this essay to a volume devoted to Peter Schroeder-Heister in the series of "Outstanding Contributions to Logic". I have had the privilege to follow part of Peter's career from his doctoral dissertation. His work in proof-theory has influenced my own in substantial ways. In the last years we have had intensive discussions on the topic of this essay. As indicated in the essay, I am now following a path partly different from the one he argues for. The concept of valid inference is however a difficult one to explicate, and as can be seen from this essay, I have tried several blind alleys in the past. I hope for a continued discussion and co-operation with Peter on these issues and a deepened understanding of validity.

and in particular with what we expect to achieve when we perform this activity and infer a conclusion from some premisses.1

Regarded as a mental act, an inference comprises a number of judgments and consists in a transition to one of them, the conclusion, from the other ones, the premisses. In this transition, the premisses are held to support the conclusion, which thereby is taken to be justified. In a *deductive* inference the conclusion is held to get a *conclusive* support or, as one also says, to be provided with a *binding* ground; these attributes will usually be left out since we are concerned here with deductive inference exclusively.

When an inference is verbalized, it becomes a compound speech act comprising a number of assertions. They can be seen as manifestations of judgements, performed by uttering sentences with assertive force. That the conclusion is taken to be supported by the premisses is then typically indicated by inserting a prefix, like "therefore" or "hence", in front of the conclusion, or, when the conclusion is stated first, by beginning the premisses with a word like "since" or "because". Inferences will here be seen primarily as speech acts of that kind.

An inference is thus not just a succession of assertions. Its crucial feature is that one of the assertions, the conclusion, is held to be supported by the other assertions, the premisses.2 To support the assertion that appears as conclusion and to justify it thereby is also the very aim of the inference. An individual inference may of course be driven by all kinds of different individual aims. But the *characteristic aim* (to use a term from speech act theory) of inference seen as an act-type is to obtain a support for the conclusion. When one holds the conclusion to be supported by the premisses, one thus understands the inference act as having been successful in attaining its aim. What precisely it amounts to for an inference to provide its conclusion with a conclusive support is a question that we have to come back to when trying to explicate the concept of valid inference; the person who makes an inference need not have an answer to this question but may nevertheless be right in holding the premisses to support the conclusion.

This view of inferences as transitions from categorical assertions already justified to conclusions that become justified agrees essentially with how Frege saw inferences. Established deductive practice knows however plenty of inferences that do not conform to Frege's picture. Reductio ad absurdum, frequently used already in Greek antiquity, is an example. It presupposes reasoning from assumptions not considered by Frege (although his two-dimensional way of writing formulas could be said to be a way to represent assertions made under assumptions3). Such reasoning got an explicit and regimented form only later with Gentzen (1935) and Jaśkowski (1934).

A full account of deductive inferences should pay attention also to such reasoning. We shall therefore allow not only categorical assertions but also hypothetical ones, also called assertions made under assumptions. Furthermore, we shall allow assertions

<sup>1</sup> That inferences are primarily acts has been emphasized in contemporary logic by Martin-Löf and Sundholm in particular; see for instance Martin-Löf (1985) and Sundholm (1998).

<sup>2</sup> This feature of inferences is also stressed by Boghossian (2014).

<sup>3</sup> See Tichý (1988), von Kutchera (1996), and Schroeder-Heister (2014).

and assumptions that are unsaturated or open, expressed by sentences containing free variables.

This makes inferences more complicated as compared to what was said above, since in addition to being transitions from premisses to conclusions they may also discharge or, as I shall say, *bind* assumptions that the premisses depend on. They may also bind variables that occur free in asserted sentences. The terminology is meant to hint at how we understand reasoning with assumptions and variables that are free in the argument, not bound by any inference, namely as a kind of schematic reasoning intended to remain correct if free variables are replaced by closed terms and free assumptions are replaced by valid arguments for them; we shall return to this later to make it clear.

#### **2 Arguments**

To describe how inferences bind variables and assumptions we have to consider *arguments*, by which we shall understand reasoning that proceeds by making a number of inferences chained to each other so that the conclusion of one inference also becomes a premiss of another.

The validity of an inference or argument should of course not depend on who makes the inference or in what situation or at what time the inference is made, provided no indexicals are involved,4 which I presuppose here. We may therefore restrict ourselves here to *generic acts* where we have abstracted from such features of individual inferences or arguments.

An argument, that is, a generic argument act, is determined by its inferences, their ordering, and information concerning initial premisses about whether they are asserted outright, categorically, or occur as assumptions, a category of speech acts of its own. For convenience I shall sometimes count initial premisses that are asserted outright as inferred by inferences from zero premisses.

The inferences of the argument are in turn determined by their premisses and conclusions and by the variables and assumption occurrences that they bind; they may be seen as determined by yet other factors, but here I restrict myself to the mentioned ones.

When we make an argument its inferences become of course ordered linearly by time, but for the purpose of logic it is sufficient and in fact more to the point to require that the ordering is a strict partial order such that for each inference, except for one last inference, its conclusion is also the premiss of the immediately succeeding inference. This ordering gives rise to a strict partial ordering of the assertion occurrences of the argument, too. There is thus one last conclusion of an argument, called its *final conclusion*. Each occurrence of an assertion determines a *subargument*, namely the initial part of the argument that has the occurrence as its final conclusion. By

<sup>4</sup> I owe this proviso to Cesare Cozzo.

the *immediate subarguments* of a given argument are understood the subarguments determined by the premisses of the last inference of the argument.

In ordinary deductive practice, an inference that binds an assumption binds all its occurrences. But when studying inferences on a meta-level, considering among other things operations on them, it is important to allow an inference to bind only some occurrences of an assumption.

The dependency on assumption occurrences is defined inductively: An assumption occurrence *depends* on itself; the conclusion of an inference *depends on* every assumption occurrence that a premiss depends on and that is not bound by the inference. An occurrence of an assertion in an argument depending on the set of assumptions Γ is said to be an *assertion under the (set of) assumptions* Γ. An argument Π whose final conclusion is an occurrence of A depending on the set of assumptions Γ is said to be an *argument for* the assertion A *from (the set of assumptions)* Γ; when the final conclusion does not depend on any assumption, Π is said to be an *argument* for A.
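The inductive clauses just given lend themselves to a mechanical reading. The following Python sketch is an illustrative gloss, not part of the formal development: the two classes and the string encoding of sentences are invented here, but the dependency computation follows the two clauses exactly (an assumption occurrence depends on itself; a conclusion inherits the dependencies of its premisses minus the occurrences the inference binds).

```python
# Toy model of argument trees with assumption binding; the class names
# and string sentences are illustrative inventions, not the text's notation.

class Assumption:
    """An initial premiss occurring as an assumption."""
    def __init__(self, sentence):
        self.sentence = sentence

    def depends_on(self):
        return {self}  # an assumption occurrence depends on itself

class Inference:
    """A transition from subarguments (premisses) to a conclusion,
    possibly binding some assumption occurrences above it."""
    def __init__(self, sentence, premisses, binds=()):
        self.sentence = sentence
        self.premisses = premisses
        self.binds = set(binds)

    def depends_on(self):
        deps = set()
        for p in self.premisses:
            deps |= p.depends_on()
        return deps - self.binds  # bound occurrences are discharged

# Example: from the assumption "A" infer "A or B"; then bind that
# assumption occurrence to conclude "A -> (A or B)".
a = Assumption("A")
step1 = Inference("A or B", [a])
step2 = Inference("A -> (A or B)", [step1], binds=[a])
print({occ.sentence for occ in step1.depends_on()})  # {'A'}
print(step2.depends_on())                            # set()
```

Since `step2` binds the only assumption occurrence, its conclusion depends on no assumptions, matching the definition of an argument *for* its final conclusion.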

When an inference binds a variable, it binds all occurrences of the variable that are free in the assertions of the subargument determined by a premiss and are not bound by an inference in the subargument. Such binding is not to occur if the variable has free occurrences in an assumption that the conclusion depends on; a restriction imposed in order that the replacement described at the end of the previous section is to give the desired result.

An argument is said to be *closed* when all its assumption occurrences as well as all the variables that occur free in an assertion of the argument are bound. It is said to be *open* otherwise. Note that an occurrence of a variable in an assertion of an argument may be free *in the assertion but bound in the argument*.

If Π is an argument for the assertion A and Σ is an argument in which A occurs as a free assumption, we understand by a *composition of* Π *and* Σ a result of putting the two arguments together by letting one or several free occurrences of the assumption A in Σ come after an occurrence of Π; these assumption occurrences in Σ are in this way replaced by arguments for them.
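Composition admits a similarly mechanical gloss. In the following Python sketch (again with invented names, and simplified in that *every* free occurrence of the assumption is replaced, whereas the text allows replacing only one or several), an argument is a nested tuple whose leaves are assumptions:

```python
# Toy illustration of composing arguments: replacing free assumption
# occurrences of a sentence in Sigma by (copies of) an argument Pi for it.
# All names and the tuple encoding are illustrative only.

def compose(pi, sigma, sentence):
    """Replace every free assumption leaf `sentence` in sigma by pi."""
    kind, payload = sigma[0], sigma[1]
    if kind == "assume":
        return pi if payload == sentence else sigma
    # kind == "infer": payload is the conclusion, sigma[2] the premisses
    return ("infer", payload, [compose(pi, p, sentence) for p in sigma[2]])

# Pi: a one-step argument for "A"; Sigma: an argument for "B" from the
# free assumption "A".
pi = ("infer", "A", [])
sigma = ("infer", "B", [("assume", "A")])
print(compose(pi, sigma, "A"))
# ('infer', 'B', [('infer', 'A', [])])
```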

#### **3 Representations of inferences and arguments**

We have to distinguish between generic inferences considered in isolation and inferences occurring in an argument. For inferences that do not bind assumptions or variables — let us call them *simple inferences* — there is no difference: A simple inference is determined by its premisses and conclusion. It can be represented by a figure of the form

$$\begin{array}{c c c c} A\_1 & A\_2 & \dots & A\_n \\ \hline & B & & \\ \end{array}$$

where $B$ is the sentence asserted by the conclusion and $A_1, A_2, \ldots, A_n$ are the sentences asserted by the premisses of the inference. (This common way of representing

generic inferences introduces an order between the premisses that is insignificant but will actually be used sometimes as a way of reference.)

In the general case, a generic inference is determined also by the variables that it binds and the assumptions that it may bind. It can be represented by a figure of the form

$$\begin{array}{c} [\Gamma] \\ A\_1 \quad A\_2 \quad \dots \quad A\_n \\ \hline B \end{array} \; V$$

where $A_1, A_2, \ldots, A_n$, and $B$ are again sentences, $V$ is a set of variables, and Γ is a set of sentences. When applied within an argument, all free occurrences of the variables of $V$ in assertions of the argument for $A_i$ not already bound by other inferences become bound, and furthermore occurrences of assumptions that have the shape of a sentence in Γ and have $A_i$ in their scope may become bound ($i \le n$); not until the generic inference is applied in a specific argument is it determined which assumption occurrences become bound — the generic inference determines only which assumptions *may* become bound. For convenience, I presume that the variables in $V$ do not occur in $B$.

Figures of the form exhibited above that represent generic inferences can be called *inference figures*. An *inference rule* or *schema* is like an inference figure but instead of containing sentences, predicates, individual variables, and individual terms it may contain schematic letters for them. An *instance* of an inference rule is obtained by replacing the schematic letters by specimens of the kind that they stand for, and is thus a generic inference (figure).

A generic argument act can be represented conveniently by a tree of assertions, which in turn may be represented by writing Frege's assertion sign in front of the sentences asserted — or, dropping the assertion sign, by just the sentences asserted. At the top of the tree are put sentences representing the initial premisses with information about whether they represent categorical assertions or assumptions; in the latter case the sentence represents both the assumption made and the assertion of the sentence under that assumption. Going down in the tree, we put successively sentences that represent the assertions inferred. The bindings of variables and assumptions are to be marked at the inference where they occur (e.g., one may attach the same numeral to an assumption and to the inference that binds it).

Note that when an argument is represented in this way by a tree of sentences, an assertion of a sentence $A$ under the assumptions Γ is represented by just the sentence $A$; thus, it is only $A$ that appears as a premiss or conclusion of an inference — the assumptions Γ that $A$ depends on are easily read off from the tree. This is Gentzen's original way of arranging his natural deductions; arguments differ from them only in the respect that their inferences need not be instances of predetermined rules but can be of any kind.5,6

When we want to distinguish the representations of arguments from the arguments that they represent, we may call them *argument fgures*. They may be seen as protocols of argument acts in which all the features of the acts that matter logically are noted down.

The inferences of an argument can be seen as *applications* of generic inferences. We shall allow that a substitution of terms for free variables is made at such applications; as usual it is here taken for granted that the terms do not contain free variables that become bound in the result of carrying out the substitution. Let G be a generic inference represented by the inference figure exhibited above. An inference of an argument is an *application of* G if and only if, for some (possibly empty) substitution $\sigma$ of terms for variables different from the ones of $V$ and occurring free in $A_1, A_2, \ldots, A_n$, $B$, or sentences of Γ, it holds that: 1) the premisses and conclusion of the inference are $A_1\sigma, A_2\sigma, \ldots, A_n\sigma$, and $B\sigma$, respectively; 2) the inference binds all occurrences of the variables of $V$ that stand free in assertions of the argument for a premiss $A_i\sigma$; and 3) the inference binds at most some occurrences of assumptions of the form $C\sigma$, for a sentence $C$ of Γ, in the argument for a premiss $A_i\sigma$ ($i \le n$). Note that there can be several different generic inferences of varying generality of which an inference of an argument is an application; replacing an individual term of a sentence of a generic inference with a variable, we get a new, more general generic inference that has as applications all the applications of the first, more specific generic inference. By a *result of applying* G to a sequence or a set of arguments $\{\Pi_1, \Pi_2, \ldots, \Pi_n\}$ we shall understand an argument whose last inference is an application of G and whose immediate subarguments are $\Pi_1, \Pi_2, \ldots$, and $\Pi_n$; their final conclusions have to be $A_1\sigma, A_2\sigma, \ldots, A_n\sigma$ for some substitution $\sigma$.
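The role of the substitution in an application can be illustrated with a toy encoding, invented here, of sentences as nested tuples: applying a generic inference carries one and the same substitution out uniformly on all premisses and on the conclusion.

```python
# Toy rendering of applying a substitution of terms for free variables
# to an inference figure; the tuple encoding of sentences is invented
# for illustration only.

def subst(sentence, sigma):
    """Carry out the substitution sigma uniformly through a sentence."""
    if isinstance(sentence, tuple):
        return tuple(subst(part, sigma) for part in sentence)
    return sigma.get(sentence, sentence)  # a variable or constant symbol

# Generic inference: from P(x) and Q(x) infer P(x) and Q(x).
premisses = [("P", "x"), ("Q", "x")]
conclusion = ("and", ("P", "x"), ("Q", "x"))
sigma = {"x": "c"}  # substitute the closed term c for the variable x
print([subst(p, sigma) for p in premisses], subst(conclusion, sigma))
# [('P', 'c'), ('Q', 'c')] ('and', ('P', 'c'), ('Q', 'c'))
```

The resulting instance has premisses $A_1\sigma, \ldots, A_n\sigma$ and conclusion $B\sigma$, as condition 1) above requires.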

Since all particular features of generic acts of inferences or arguments that are logically significant are present also in the figures that represent them, we may as well make the syntactical representations instead of the acts themselves the object of

$$\frac{Pc\ \text{(assumption)}}{\exists x\,Px} \qquad\qquad \frac{Pc \implies Pc}{Pc \implies \exists x\,Px}$$

<sup>5</sup> An alternative is to represent the assertions of the argument by sequents Γ =⇒ $A$, where Γ is a sequence of sentences representing the assumptions that $A$ is asserted under. Premisses and conclusions of inferences will then be represented by sequents instead of sentences. In some later publications Gentzen adopted this way of writing natural deductions. At this level it is only a question of alternative representations of one and the same argument act. However, if we follow Sundholm (2006) and understand a sequent $A_1, A_2, \ldots, A_n$ =⇒ $B$ as representing an assertion saying "if $A_1$ is true, $A_2$ is true, . . ., and $A_n$ is true, then $B$ is true", the tree of sequents will primarily represent not reasoning from assumptions but reasoning starting from axioms of the form "If $A$ is true, then $A$ is true". The inferences too take different forms; cf. the displayed figures above.

<sup>6</sup> The Curry-Howard isomorphism suggests that there is also an alternative representation of arguments by terms in an extended lambda calculus containing parameters for functions corresponding to different arbitrary inferences of an argument. However, to establish really that this is a possible way to represent arguments, we have to pin down what it is to argue from assumptions, which is what is attempted here, partly by using a representation that is closer at hand.

study. The figures may simply be called inferences and arguments, respectively, as is customary in logic; the representation of an argument for the assertion of a sentence $A$ may for simplicity be called an argument for $A$, and so on. It remains however that when discussing their validity, one should recall that the syntactical objects are representations of acts with aims; this general feature of the generic acts is of course lost when they are represented by figures.

#### **4 Soundness and validity of inferences. A first approximation of validity**

As remarked in the introduction, the validity of an inference has commonly been identified with the holding of the relation of (logical) consequence. The inferences or rather inference figures that one has in mind here are the simple ones at which no assumptions or variables are bound. Such an inference figure

$$\begin{array}{c c c c} A\_1 & A\_2 & \dots & A\_n \\ \hline & B & & \\ \end{array}$$

where $A_1, \ldots, A_n$ and $B$ are closed sentences, is valid, one has said, when the inference is necessarily truth preserving, spelled out by saying either that it is impossible that all the premisses $A_i$ ($i \le n$) are true while the conclusion $B$ is false, or that necessarily, if all $A_i$ ($i \le n$) are true, then so is $B$. These two conditions are of course equivalent classically and have also been used alternatively in the traditional definition of the relation of entailment or consequence.

With Bolzano and Tarski the modal notion of necessity or impossibility is replaced with a variation of the meaning of the non-logical terms of the sentences involved and of the individual domain. We then get the well-known definition saying that a sentence $A$ is a logical consequence of a set of sentences Γ when $A$ is true under each variation of assignments to the non-logical terms and of the domain of the individual variables under which all the sentences of Γ are true. This has become the dominant definition also of the validity of an inference from Γ to $A$ in contemporary philosophy and logic.
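For the propositional case this definition can be checked mechanically: vary the assignments to the non-logical (here: atomic) parts and test whether every assignment verifying all premisses also verifies the conclusion. The following Python sketch, with invented helper names, contrasts a simple inference that preserves truth under all such variations with one that does not:

```python
# Finitary, propositional caricature of the Bolzano-Tarski definition:
# check truth preservation under every assignment of truth values to
# the atomic sentences. Helper names are illustrative only.

from itertools import product

def is_consequence(premisses, conclusion, atoms):
    """premisses, conclusion: functions from a valuation (dict) to bool."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) for p in premisses) and not conclusion(v):
            return False
    return True

# Modus ponens: from A and A -> B, infer B (truth preserved throughout).
mp = is_consequence(
    [lambda v: v["A"], lambda v: (not v["A"]) or v["B"]],
    lambda v: v["B"], ["A", "B"])

# Affirming the consequent: from B and A -> B, infer A (not preserved).
ac = is_consequence(
    [lambda v: v["B"], lambda v: (not v["A"]) or v["B"]],
    lambda v: v["A"], ["A", "B"])

print(mp, ac)  # True False
```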

The concept of valid inference that we are concerned with in this essay must obviously be very different from that notion of valid inference. Even if one considers only simple inferences, that notion demands both too much and too little from the perspective of this essay. Although clearly a valid inference cannot have true premisses and a false conclusion, inferences by for instance mathematical induction come out as non-valid when it is required that a valid inference preserve truth under all variations of the meaning of the non-logical terms and of the domain of the variables. More importantly, the prevalent notion demands too little. To establish that something is a logical consequence of something else we usually need a proof, often a long proof with many inferences. What is to be required of these inferences cannot then be that the sentence asserted in the conclusion is a logical consequence of the sentences asserted in the premisses; if that was sufficient, there would never be a need of proofs containing more than one inference step.

The property of inference that has commonly been called validity is nevertheless a significant one, and I propose that it be called *soundness*. This would be in agreement with established terminology in connection with deductive systems, which are called sound when their inference rules preserve truth.

Already at the beginning of logic one was interested in distinguishing a kind of inference that satisfied stronger demands than soundness. Aristotle distinguished between *syllogisms* in general and *perfect syllogisms*, saying:

A syllogism is a form of speech in which, certain things being laid down, something follows of necessity from them.

A perfect syllogism is one that needs nothing other than the premisses to make the conclusion evident.7

Aristotle's general notion of syllogism (not restricted to the particular inferences that he studied in detail) has been a common point of departure for discussions and different proposals about what later became called valid inference. In contrast, there has been little interest in trying to develop his narrower notion of perfect syllogism; the attention it has received seems mostly to have been of an exegetical kind, about what Aristotle intended with that notion. Having an epistemic ingredient, it seems to be in the same direction as the concept that is focused on in this essay.

The term "evident" used in the above translation of Aristotle's definition of perfect syllogism may seem to be natural to use here in view of its etymology: when an inference is valid, it should be "seen" that the conclusion is right given that the premisses are.8 However, the term is not to be understood here as referring to the actual state of mind of a person when something is obvious to her. We do not want to say that an inference is valid for a person, nor is it likely that Aristotle meant that a syllogism can be perfect for one person but not for another. The term must therefore be understood here not primarily as a psychological term but as referring to an objective property: A valid inference or perfect syllogism gives evidence to the assertion made in the conclusion in the sense that it gives a ground for the assertion, which thereby becomes justified. When understood in this way "evidence of an assertion" may be used interchangeably with "ground for an assertion". It is another thing that the existence of a ground for an assertion can in principle become known and therefore makes the assertion potentially evident to a person.

The notion of ground has of course a broader use. We are here interested in epistemic grounds. A speaker is normally expected to have some kind of epistemic ground for what he or she asserts. The nature of what is counted as such grounds for assertions varies with diferent kinds of assertions. For assertions with empirical content a ground may be obtained by observations under suitable circumstances. A ground for the assertion of an arithmetical identity may be got from a computation. How good the ground is required to be varies with the context. In some contexts, for

<sup>7</sup> Ross (1949, p. 287).

<sup>8</sup> The term is used by Martin-Löf and Sundholm too in their writings on the validity of inference, see, e.g., Martin-Löf (1985) and Sundholm (2004).

instance in mathematics, the ground is expected to be binding, and this is the case that concerns us now.

When we make an inference, it is understood that the ground for the conclusion comes from the premisses. But how? The premisses of an inference are sometimes called grounds for the conclusion, but clearly the premisses themselves do not constitute grounds for the conclusion. The ground for the conclusion must rather come somehow from their grounds, which we take implicitly to exist since they are asserted.

For the inference to be valid, there must be some immediacy in how the ground for the conclusion comes from the grounds for the premisses. No further inferences should be needed to obtain the ground. This is a point that Aristotle perhaps wants to make when he says that the perfect syllogism "needs nothing other than the premisses".

The validity of an inference thus requires that a ground for the conclusion appears directly given any grounds for the premisses; "given" in the sense of being at least assumed to exist. One should expect furthermore that the meanings of the sentences involved could be crucial for the validity of the inference.

The concept of validity is to be tied primarily to generic inferences. What has been said so far may be put together as a first rough approximation of their validity:

For a generic inference to be valid it must be required that in virtue of the meanings of the involved sentences, it appears directly, without any further inferences, that given any grounds for the premises, there is a ground for the conclusion.

If the stated requirement is not satisfied, the assertion that occurs in the conclusion is made without a ground, or at least not with a ground coming from the premisses, and the inference cannot then be valid. The requirement is also sufficient for the validity of an inference: when it is satisfied a person who makes the inference is being provided with a ground for the conclusion, or can at least easily provide herself with such a ground, given that she knows the meaning of the involved sentences and has grounds for the premisses; the assertion appearing as the conclusion of the inference thereby becomes justified and what was aimed at when making the inference is thus achieved. An argument consisting of inferences that all satisfy the proposed requirement gives a ground for its final conclusion, and if closed it can then be called a proof. To call such inferences valid is thus in accordance with my introductory declaration.

One may object to the requirement of directness and remark that a challenge of an inference is typically met by inserting a number of other inferences between the premisses and the conclusion. If these inferences are accepted as valid, one normally accepts as valid also the challenged inference. But this objection is built on another concept of validity than the one we are now concerned with. We want to clarify what may be called *immediate validity*, where the point is that the inference should satisfy certain requirements as it stands (without adjuncts). When that has been clarified, one can easily define what it is for a simple inference to be *mediately valid*: there is an argument for the conclusion of the inference from its premisses that uses only immediately valid inferences. The problem is to explicate immediate validity, and I

shall continue to refer to it simply as validity, using the term mediate validity for the property that can then be defined in terms of it.

To get on with this task we must especially inquire what constitute grounds for assertions in the present context. In mathematics we have expected, since the time of Greek antiquity, that grounds for categorical assertions come in the form of deductive proofs, valid closed arguments. At least in the case of categorical assertions of logically compound sentences, we know no other way to obtain conclusive justifications. In case the assertions are not categorical but hypothetical or open, the grounds take the form of valid open arguments.

However, if we explain proofs as valid closed arguments and valid arguments as arguments consisting of valid inferences, as I have suggested above, and then explain the validity of inferences in terms of grounds explicated as valid arguments, we are of course moving in a circle.9 This may seem disastrous for the proposed explanation of valid inference and valid argument that I have just begun.

#### **5 Other concepts of proof**

Is there a way to avoid this circularity problem? To turn to other ways of understanding the concept of proof may be thought to be a possibility. In the philosophy of intuitionistic mathematics, a proof has been seen not as a chain of valid inferences but as a mathematical construction. A sentence is taken to express the intention of a construction and to prove the sentence is to realize this intention.10 More precisely, a proof is the construction process that results in the intended construction expressed by the sentence.11

In the so-called BHK-interpretation12 as usually understood, a shift occurs so that a proof becomes rather the intended construction itself, not the realization process that establishes the existence of the intended construction. The proofs are there defined by recursion over the build-up of the proved sentences; for instance, "a proof of A → B is a construction that permits us to transform any proof of A into a proof of B".13 However, the construction intended by a sentence cannot in itself in general constitute a ground for asserting the sentence. To have defined a construction that in fact transforms any proof of A into a proof of B does not justify the assertion of

<sup>9</sup> This circularity problem was noted in several lectures by Martin-Löf in the last decade. The problem is noted in one of his earlier papers too (Martin-Löf, 1985), where he saw it as a mistake to take the concept of (valid) immediate inference as conceptually prior to the concept of proof and concluded: "inference and proof are the same".

<sup>10</sup> Heyting (1934).

<sup>11</sup> As Heyting (1958) puts it: "The steps of the proof are the same as the steps of the mathematical construction." See also Sundholm (1983) concerning the ambiguity of the term construction.

<sup>12</sup> Stated by Troelstra and Dalen (1988).

<sup>13</sup> Troelstra and Dalen (1988, p. 9).

the implication A → B unless we have some ground for holding that the defined construction does effect the transformation in question.14
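Read through Curry–Howard glasses, the BHK clause just discussed can be mimicked with ordinary functions. The following Python sketch is a hypothetical illustration of mine, not part of the text, and it models only the construction itself, not the further ground that the construction works:

```python
# Hypothetical modelling: a BHK-style "proof" of an implication
# A -> B is a transformation of proofs of A into proofs of B,
# represented here as a Python function; a proof of a conjunction
# A & B is represented as a pair, mirroring the recursion over
# the build-up of sentences.

def implication_proof(transform):
    """Wrap a transformation as a BHK-style proof of A -> B."""
    return transform

# Example: a proof of (A & B) -> (B & A) is the swap function.
swap = implication_proof(lambda p: (p[1], p[0]))

print(swap(("proof of A", "proof of B")))  # ('proof of B', 'proof of A')
```

The point made in the text is visible here: nothing in the function object itself certifies that it has the transforming property; that certification is exactly what the mere construction lacks.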

When Heyting's general idea of proofs as the realizations of intended constructions is further developed, as is done in Martin-Löf's type theory, we are led to a construction process in which at each step it is also demonstrated that the construction obtained is of the right, intended kind.15 The process will thus contain a chain of inferences. We are then back to a notion of proof that presupposes the notion of valid inference.

We thus find that a proof in intuitionistic mathematics is either an intended construction of a proposition or a demonstration establishing that a given construction is the intended construction of a certain proposition. In the first case it does not in itself constitute a ground for asserting the proposition, and in the second case it is a chain of valid inferences. In neither case does it offer a solution to our problem.16

#### **6 The conceptual order between valid inference and valid argument**

Sticking to the idea of proofs as chains of valid inferences, one may contemplate, as a way out of the circularity problem, the possibility of first defining the concept of valid argument without referring to the validity of inference and then the concept of valid inference in terms of it, thus turning upside down what has seemed to be the natural conceptual order.17 A proposal for how to take the last step from valid argument to valid inference is to define an inference as valid when any application of it to valid arguments is a valid argument.

The proposed defining condition should in fact be a necessary condition for the validity of an inference in accordance with the basic idea concerning how open

<sup>14</sup> For this reason, in the BHK-interpretation first presented by Troelstra (1977, p. 977), the proof of an implication A → B did not consist of just a construction c that in fact transforms any proof of A into a proof of B but contained also "the insight that c has the property: p proves A =⇒ c(p) proves B". To include such an "insight" into the proofs is of course difficult to make compatible with the intuitionistic idea that proofs are mathematical objects, but by dropping this element from the proofs, as was done in the later and more well-known BHK-interpretation presented by Troelstra and Dalen (1988), the BHK-proofs lost the general epistemic power to justify the assertion of the sentences that they are proofs of. See also fn. 18.

<sup>15</sup> See for instance Martin-Löf (1984).

<sup>16</sup> In several papers (see for instance Prawitz 2015b; 2019b), I have discussed in a positive vein the possibility of identifying the grounds for asserting sentences with the intended constructions that the sentences are taken to express when understood in an intuitionistic sense, although noting that the way they are defined must be restricted. I do not anymore consider this approach as promising, since I know no way of making the right restriction. It is true that when the constructions are restricted to what can be defined in certain extended lambda calculi, it can be seen directly from well-formed terms that they denote the intended constructions of certain sentences. But the constructions intended by sentences of sufficient complexity cannot be exhausted by what can be defined in formal systems.

<sup>17</sup> I proposed a definition of such a notion of valid argument at a fairly early stage (Prawitz, 1973). It was later taken up and somewhat modified by Dummett (1991) and Schroeder-Heister (2006) and was modified more recently and more radically by myself (Prawitz, 2019a).

valid arguments are to be understood as schematic reasoning, presented at the end of Section 1. That idea can be stated more precisely in the form of a principle as formulated below. To get a short formulation, let us say that a *regular instance* of an argument is obtained by first replacing a number of variables that are free in the argument by terms and then, in the resulting argument, replacing a number of free assumptions by valid arguments for them; the order is important in case one wants to obtain a closed regular instance.

*Principle concerning open arguments and their instances* All regular instances of valid arguments are valid.

Since an inference can be seen as a one-step open argument whose premisses are assumptions, and the result of applying it to valid arguments is a regular instance of that argument, we find in particular that it is a necessary condition for the validity of an inference that applications of the inference to valid arguments result in valid arguments.

However, the condition is not sufficient. If there is no valid argument for the assertion of A, the inference from the assertion of A to any other assertion satisfies the condition vacuously.18 For instance, since, as we know from the proof of Fermat's Theorem, there is no closed valid argument for the premiss of the following inference

$$\frac{\exists x \exists y \exists z \, x^3 + y^3 = z^3}{\bot}$$

(the variables being supposed to range over positive integers), the inference comes out as valid according to the proposed definition of validity. Of course, the inference should come out as mediately valid, but certainly not as (immediately) valid; contrary to what Fermat thought, we seem to need a quite long and complicated argument to refute the assumption ∃x∃y∃z x³ + y³ = z³.

The proposed definition of valid inference in terms of valid argument thus fails, and I see no way of attaining such a definition. The validity of arguments seems best explained by saying that a valid argument is one whose inferences are all valid. If the validity of inferences in turn is explained in terms of grounds, as proposed above, and grounds for an assertion consist in valid arguments for it, we must conclude that the concepts of valid inference and valid argument depend on each other and cannot be defined in isolation. If so, we have to be satisfied with stating principles about how they are related to each other and to some other concepts.19

<sup>18</sup> The converse of the principle, in particular the idea that an open argument is valid if all its closed, regular instances are, therefore fails too. It was a substantial part of the definition of valid argument mentioned in fn. 17. The notion of hypothetical proof proposed by Martin-Löf (1985) suffers from the same problem: any argument from the assumption of a false sentence satisfies vacuously his defining condition for being such a proof. Similarly, the negation of every false sentence A comes out as having a BHK-proof: there is a construction that permits us (vacuously) to transform any proof of the false sentence A into a proof of ⊥, namely the (empty) function that is defined for all proofs of A and, for each such proof, assumes as value a proof of ⊥.

<sup>19</sup> In the sequel I shall take this mutual dependency between valid inference and valid argument as a working hypothesis. I want nevertheless to keep open that the validity of inferences could be

## **7 Principles and heuristic ideas on the validity of inferences and arguments**

A basic principle, which expresses intuitions already referred to, is:

*Principle 1. The relation between validity of inferences and validity of arguments* An argument is valid, if and only if, all its inferences are valid.

This principle establishes a relation between the validity of an argument and the validity of the inferences that the argument consists of. We have to relate the latter validity to the validity of generic inferences, to which the concept of validity is primarily tied, as suggested above. An inference of an argument can always be seen as an application of a generic inference, but as noted above (Section 3), it may be the application of several different generic inferences of varying generality. To be counted as valid, it should be sufficient that it is an application of one valid generic inference (which is the same as saying that the least general generic inference that it is an application of is valid). We thus define:

#### *Definition 1.*

An inference of an argument is valid, if and only if, it is an application of a valid generic inference.

Principle 1 does not amount to a definition of the concept of valid argument as long as we lack a definition of the concept of validity of generic inference not depending on the concept of valid argument, but it is still informative about the involved concepts and has several immediate corollaries. Some of them are noted below for later use:

#### *Corollary 1.*

All results of applying a valid generic inference to a set of valid arguments are valid arguments.

*Proof.* Consider a valid generic inference G and let Π be the result of an application of it to a set of valid arguments. By Principle 1 all inferences of these arguments are valid, and by the definition above so is the application of G. Since all the inferences of Π are hence valid, Π is valid by Principle 1, now used in the other direction.

#### *Corollary 2.*

A subargument of a valid argument is valid.

*Proof.* Let Π be a valid argument and let Π′ be a subargument of Π. By Principle 1 all the inferences of Π are valid and hence so are all inferences of Π′. The validity

explained in other ways without reference to grounds for the involved assertions. For example, in discussions about the validity of inferences that Peter Schroeder-Heister and I have had, he has suggested that one should demand more of a valid inference than I have done here. It should not only give a ground for the conclusion in the form of a valid argument for it when valid arguments for the premisses are given, but should more generally guarantee an argument for the conclusion, good or bad, of the same quality as the given arguments for the premisses. This stronger requirement should be possible to express without referring to valid arguments, he suggests.

of Π′ follows by using Principle 1 in the other direction. By the same kind of reasoning we get:

*Corollary 3.* A composition of two valid arguments is valid.

The part of the principle stated in the previous section that concerns the result of replacing free assumptions of a valid argument by valid arguments for them follows as a corollary of Principle 1, since such replacements can be seen as the effect of an iterated composition — the result is thus valid by Corollary 3. The other part is an independent principle, which we note down here:

*Principle 2. The relation between an argument and its substitution instances* The result of substituting terms for variables that are free in a valid argument is valid.

The principle of the previous section is thus obtained as a corollary of Principles 1 and 2:

#### *Corollary 4.*

All regular instances of valid arguments are valid.

I shall presuppose that the languages in which inferences are verbalized have closed terms for all individuals in the domain that the variables are intended to range over; as usual, the domain is supposed to be non-empty. Consider under this presupposition the following somewhat strengthened converse of Principle 2: an open argument A(x₁, …, xₙ) with free variables x₁, …, xₙ is valid, if all results A(t₁, …, tₙ) of substituting closed terms t₁, …, tₙ for the variables are valid. Is it a reasonable principle?

The answer must clearly be no, since that would be contrary to the idea of valid inference discussed here: the fact that all the arguments A(t₁, …, tₙ) are valid cannot be sufficient for the validity of the open argument A(x₁, …, xₙ), unless this fact appears directly from made inferences and the meanings of the involved sentences. We shall return to this issue when now returning to what I called a first approximation of the validity of inferences (Section 4).

This first approximation now amounts to another basic idea concerning the relation between validity of inference and validity of argument, once it has been acknowledged that grounds for assertions consist of valid arguments for them. It now reads as follows when put in the form of an equivalence and restricted to simple inferences (which do not bind anything) whose premisses and conclusions are closed, valid arguments for them therefore amounting to proofs:

A simple generic inference whose premisses and conclusion are closed is valid, if and only if, in virtue of the meanings of the involved sentences, it appears directly, without any further inferences, that given any proofs of the premisses, there is a proof of the conclusion.

It follows from Principle 1 that a necessary condition for the validity of a generic inference is the existence of a valid argument for the conclusion given valid arguments for the premisses; indeed, the result of applying the inference to the valid arguments

for the premisses is such a valid argument for the conclusion according to Corollary 1. The equivalence above states a necessary condition for the validity of a simple generic inference (whose premisses and conclusion are closed) that is in some respect weaker and in some respect stronger than the condition of Corollary 1. It is weaker since it requires only the existence of *some* valid argument for the conclusion. It is stronger since it requires that this existence appear directly in virtue of the meaning of the involved sentences.

Furthermore, the equivalence provides a sufficient condition for the validity of an inference. On the meaning theory presented in the next section there are cases, namely so-called introduction inferences, where the very application of a generic inference to valid arguments results in an argument that is valid in virtue of the meaning of the conclusion, and where the inference is thus valid according to the equivalence above. But in other cases we shall have to find another valid argument for the conclusion in order to establish the validity of an inference via the sufficient condition stated by the equivalence.

It is to be recalled that we are now concerned with the validity of generic inferences, in terms of which the validity of the inferences of an argument is defined. A valid argument for the conclusion of a particular inference in a given argument may appear directly from the arguments for the premisses, given that they are valid, but this is not sufficient for the validity of the inference; otherwise the inference from A ∨ B to A would come out as valid when A ∨ B has been inferred after having obtained a proof of A. The above condition of validity requires that given *any* proofs of the premisses, a proof of the conclusion appear.

When an application of the generic inference binds variables and may bind occurrences of assumptions at a premiss, the ground for that premiss assumed to exist in the condition for the inference to be valid takes the form of a valid argument for the premiss from the set of assumptions whose occurrences it may bind. However, when the premiss is an open assertion, the condition for validity must require more. Consider the inference represented by the figure

$$\frac{A(x)}{B(x)}$$

where A(x) and B(x) are sentences that contain the one free variable x. For the inference to be valid, it must appear directly that given any proof of the assertion of A(t), there is a proof of the assertion of B(t), where t is any closed individual term. This condition is also sufficient for the validity of the inference under the presupposition made above that each individual in the intended domain is denoted by some term. That a valid argument for asserting B(x) appears directly given a valid argument for asserting A(x) is a weaker condition that would not guarantee that there is a proof of B(t) given a proof of A(t).

This means that the general condition for the validity of a generic inference is most conveniently formulated in terms of applications of the inference, as that notion was defined in Section 3: for any application of the inference to a set of valid arguments resulting in a closed argument Π, there is to appear a proof of the final conclusion of Π. We then arrive at the following reformulation of the first approximate explication of the validity of inference:

*Heuristic idea on the validity of inference in terms of valid arguments* A generic inference is valid, if and only if, in virtue of the meanings of the involved sentences, it appears directly, without any further inferences, that for any application of the inference to a set of valid arguments resulting in a closed argument Π, there is a proof of the final conclusion of Π.

I call it a heuristic idea because of the vagueness of the expression "it appears directly". A condition of that kind is needed for at least two reasons. One is again the need to avoid the problem of vacuity discussed in the previous section: we do not want inferences to come out as valid vacuously just because there are no valid arguments for their premisses. Such an outcome is meant to be blocked when it does not appear directly from the meaning of the premisses that there are no valid arguments for them. To illustrate again with Fermat's theorem: although it is right that when a generic inference has ∃x∃y∃z x³ + y³ = z³ as premiss, any result of applying it to valid arguments satisfies vacuously whatever condition we choose (there being no valid argument for the premiss), this fact does not appear directly in virtue of the meaning of ∃x∃y∃z x³ + y³ = z³. In contrast, the generic inference

$$\frac{\bot}{A}$$

comes out as valid according to the heuristic idea above, as it should, if we have explained the meaning of ⊥ by saying that there is no proof of ⊥; it can then rightly be said that it appears directly in virtue of the meanings of the involved sentences that, for any application of the inference to a valid argument for ⊥, whatever condition we choose is satisfied.

The requirement of directness is also meant to block that the mere existence of a closed valid argument for the conclusion is sufficient for the validity of a generic inference. For an illustration, consider the generic inference

$$\frac{\exists x \forall y A(x, y)}{\forall y \exists x A(x, y)}$$

Given any valid argument for the assertion of a sentence ∃x∀y A(x, y) (not depending on assumptions), an argument for the assertion of ∀y∃x A(x, y) can easily be constructed by using inferences that should certainly come out as valid, but since the construction uses additional inferences, it does not appear directly.
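In BHK-style proof-term notation, the mediating construction can be made explicit. The sketch below is my own hypothetical illustration, not part of the text: a proof of ∃x∀y A(x, y) is modelled as a pair (t, f) with f a proof of ∀y A(t, y), and from it a proof of ∀y∃x A(x, y) is built.

```python
# Hypothetical proof-term sketch: the mediating argument behind the
# quantifier-shift inference. From the pair (t, f) proving
# ExAy A(x,y) we build lambda y: (t, f(y)), a proof of AyEx A(x,y).
# That these extra steps (informal Ex-elimination and
# Ay-introduction) are needed is exactly why the one-step inference
# is not *immediately* valid in the sense of the text.

def swap_quantifiers(pair):
    t, f = pair                    # unpack witness and proof of Ay A(t,y)
    return lambda y: (t, f(y))     # for each y, a witness-proof pair

# Toy example with a dummy witness 5 and a dummy proof function.
g = swap_quantifiers((5, lambda y: ("A", 5, y)))
print(g(7))  # (5, ('A', 5, 7))
```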

A heavy burden is thus put on the meaning of the vague term directness. In spite of these shortcomings, the stated equivalence will serve as a heuristic guide when searching for additional, more precise principles about how valid inferences and valid arguments are related to each other.

#### **8 Meaning of assertions and validity of inferences**

There are inferences whose validity is independent of the assertions involved. An example is the generic inference represented by the inference figure

$$\frac{A \qquad \begin{array}{c} [A] \\ \vdots \\ B \end{array}}{B}$$

(sometimes called the rule of explicit substitution when taken as an inference rule). For any application of this inference to valid arguments that results in a closed argument Π, there is a proof of its final conclusion, because the composition of the two immediate subarguments of Π is such a proof when each free occurrence of the assumption A in the second subargument is replaced by the first subargument; a composition of two valid arguments being valid by Corollary 3 of Section 7. Provided the appearance of this proof by forming a composition of the two valid arguments to which the inference is applied is counted as direct, the inference is thus valid according to the heuristic idea formulated above, regardless of what sentences A and B are.
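The composition just described can be rendered as a proof-term operation. The following is a hypothetical sketch of mine (the names are not from the text): the argument for B from the assumption A is modelled as a function of that assumption, and composing amounts to applying it, so that every free occurrence of the assumption receives the closed argument for A.

```python
# Hypothetical rendering of the explicit-substitution rule: plugging
# the first subargument (a closed argument for A) into each free
# occurrence of the assumption A in the second subargument
# (an argument for B from A, modelled as a function).

def compose(arg_for_A, arg_for_B_from_A):
    """Substitute arg_for_A for the assumption in the second argument."""
    return arg_for_B_from_A(arg_for_A)

# The assumption may occur several times; each occurrence gets the
# same closed argument plugged in.
result = compose("proof of A", lambda a: (a, a))
print(result)  # ('proof of A', 'proof of A')
```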

It is surely more common that the validity of an inference depends on what the assertions involved mean. For instance, whether the two generic inferences represented by the figures

$$\frac{A}{A \lor B} \qquad\qquad \frac{B}{A \lor B}$$

are valid cannot but depend on the meaning of sentences of the form A ∨ B.

As argued by Michael Dummett, the meaning theory for a language should account for all features of the use of the language that depend on the meanings of its sentences, including the acceptance of inferences like the ones above as valid. For it to fulfil this task, it is essential how the meanings of sentences are given. A truth-conditional meaning theory of the usual kind states the condition for a sentence to be true in such a way that it may not be possible to derive from it on what kind of grounds an assertion is accepted as justified. In contrast, what Dummett calls a verificationist or justificationist meaning theory explains the meaning of a sentence directly in terms of what counts as a ground or valid argument for asserting the sentence.

Gentzen had an idea about the meanings of logical constants which is a forerunner of Dummett's idea of justificationism. A special feature of Gentzen's system of natural deduction is that for every logical constant c there are a number of inference rules called introduction rules for c, or simply *c-introductions*, where the conclusions are sentences whose outermost sign is c. After having set up his system, Gentzen remarked that the meaning of a logical constant could be seen as being determined by the c-introductions.20 It is not obvious how this suggestion is to be understood. One

<sup>20</sup> "The introductions present, so to say, the 'definitions' of the symbols concerned." (Gentzen, 1935, p. 189).

element is of course that, in virtue of the meanings of the logical constants, instances of the introduction rules are to be seen as yielding valid arguments when applied to valid arguments for their premisses. But c-introductions do not constitute the only valid ways of inferring sentences that have c as their outermost sign. So what is special about the introduction rules that gives cause for taking them as meaning constitutive?

One way to answer this question is to say that the c-introductions present the *direct* or *canonical* way of inferring sentences that have the constant c as their outermost sign. This is to be understood as implying, not only that applications of instances of introductions to valid arguments yield valid arguments as results, but also that if the assertion of a sentence is provable at all, then in principle its proof can be put in such a form that the last inference is an instance of an introduction. An argument whose last inference is an instance of an introduction is said to be in *canonical form*21 (whether open or closed, and irrespective of questions of validity).

To exemplify: that the meaning of the disjunction sign is determined by the ∨-introductions, whose instances are generic inferences of the form exhibited above, is to be understood as saying that the meaning of disjunction is such that 1) the results of applying to valid arguments generic inferences of the kind exhibited above are valid, and 2) if there is a proof of a sentence A ∨ B, then there is such a proof in canonical form. Hence, if A ∨ B is provable, there is a proof of either A or B.22

As seen, this fits well with how ∨ is understood intuitionistically, but not with how it is understood classically, since a disjunction may be provable classically while neither of the disjuncts is provable. Gentzen's ∨-introductions can thus be seen as determining the meaning of intuitionistic disjunction, but not of classical disjunction.

Adapting this idea to all the logical constants of the intuitionistic language of first-order predicate logic, the meanings of their compound sentences are explained by telling how arguments for the assertions of them have to look in order to be in canonical form. What must be told for the different cases of compound sentences is what the immediate subargument or subarguments of an argument Π in canonical form are to consist of. In the case of:

A ∧ B, they are to consist of an argument for A and an argument for B;

A ∨ B, it is to consist of an argument for A or an argument for B;

A → B, it is to consist of an argument for B from A, where free occurrences of the assumption A may be bound by the last inference of Π;

∀x A(x), it is to consist of an argument for A(x), where x is being bound by the last inference of Π;

∃x A(x), it is to consist of an argument for A(t) for some term t.
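The clauses above describe, case by case, what data an argument in canonical form carries. A hypothetical datatype sketch of mine (the class and field names are not from the text) makes this explicit:

```python
# Hypothetical datatypes for the immediate subargument(s) of an
# argument in canonical form, following the clauses case by case.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class AndCanonical:          # A & B: an argument for A and one for B
    for_left: Any
    for_right: Any

@dataclass
class OrCanonical:           # A v B: an argument for one disjunct
    which: str               # "left" or "right"
    argument: Any

@dataclass
class ImpCanonical:          # A -> B: an argument for B from A;
    from_assumption: Callable[[Any], Any]   # the assumption gets bound

@dataclass
class ForallCanonical:       # Ax A(x): an argument for A(x), x bound
    from_variable: Callable[[Any], Any]

@dataclass
class ExistsCanonical:       # Ex A(x): an argument for A(t), some t
    witness: Any
    argument: Any

# Example: a canonical argument for A v B via its left disjunct.
c = OrCanonical("left", "argument for A")
print(c.which)  # left
```

The disjunction case mirrors point 2) of the previous paragraphs: a canonical proof of A ∨ B always reveals which disjunct it proves.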

This has to be completed by telling what constitute arguments in canonical form for assertions of atomic sentences. In the case of the atomic sentence ⊥, the explanation

<sup>21</sup> A term earlier used by Brouwer in a different way. Its use in the above sense was proposed by Prawitz (1974) and Dummett (1975).

<sup>22</sup> Although Gentzen never developed his ideas about meaning more precisely, it is clear that he was thinking in this way when remarking that the assertion of A → B "attests (German: dokumentiert) the existence of a derivation of B from A". (Gentzen, 1935, p. 189)

is that there is no argument in canonical form for the assertion of ⊥; there are no ⊥-introductions. I shall assume that the meanings of all atomic sentences have been explained by telling what the arguments in canonical form for their assertions are.23 One may want to vary what is to count as canonical arguments for atomic sentences of a language. Let us say that a *base* for a language L specifies the canonical arguments for the atomic sentences and the set of closed individual terms of L. Validity of inferences can then be relativized to such bases.

To extend Gentzen's idea to other languages, one must thus be able to specify the meaning of each different sentence form by giving introduction rules for that sentence form or, in other words, by stating what constitute arguments in canonical form for the assertions of such sentences.24 It is an open question to what extent this is possible,25 but there is no problem in giving introduction rules that are adequate for the logical constants understood classically.26

To summarize the meaning theoretical view adopted here, we can say more generally: To know the meanings of the sentences of a language is to know


Knowledge according to clause 1 also determines introduction rules for all forms of sentences. The validity of all their instances is implied trivially by the heuristic idea stated in the previous section. Thus, we get:

#### *Principle 3. Validity of introduction rules*

All instances of introduction rules are valid. In particular, all instances of Gentzen's introduction rules for intuitionistic predicate logic are valid.

<sup>23</sup> An early extension of Gentzen's idea to atomic sentences is due to Martin-Löf (1971). He took Peano's first and second axioms for natural numbers as two introduction rules for sentences of the form N(t), one allowing the inference of N(0) from no premisses, and the other allowing the inference of N(s(t)) from the premiss N(t) (s standing for the successor operation).

<sup>24</sup> I have left open here general requirements that should be put on meaning explanations to guarantee for instance that they are not circular. They correspond to requirements that introduction rules are to satisfy, discussed by Dummett (1991).

<sup>25</sup> If the language contains sentences with empirical content, we may have to broaden the concept of inference and think of introduction inferences as transitions not only from assertions to other assertions but also from other acts such as observations; they deliver what is commonly called direct evidence and may be seen as meaning constitutive.

<sup>26</sup> For instance, a possible introduction rule for classical disjunction is displayed here:

$$\frac{\begin{array}{c} [\neg A, \neg B] \\ \vdots \\ \bot \end{array}}{A \lor B}$$

A and B being schematic letters for sentences. We could have a language that contains both classical and intuitionistic logical constants, kept distinct by, e.g., attaching different subscripts to them, and formulate introduction rules for all of them; see further Prawitz (2015a).

#### **9 A precise sufficient condition for an inference to be valid**

A more challenging problem is to give precise principles for how inferences can be valid in virtue of the meaning of the involved sentences without being meaning constitutive in the way the introduction rules are according to the previous section. The question is whether and to what extent we can state precise principles for such validity guided by the heuristic idea of Section 7.

Gentzen meant that the elimination rules (E-rules) of his system of natural deduction were valid because of their relation to the meaning constitutive introduction rules (I-rules). He suggested, "It should be possible to display the E-inferences as unique functions of the corresponding I-inferences, on the basis of certain requirements".27 As a step in that direction and to explain why the elimination rules are valid, I have described them as the inverses of the corresponding introduction rules in the sense of satisfying the following:

#### *Inversion principle*

If the last inference of an argument Π for A from Γ is an E-inference whose major premiss is the conclusion of an I-inference, the argument for the major premiss thus being in canonical form, the immediate subarguments of Π already "contain" an argument for A from Δ ⊆ Γ.28

The expression "contain", which was left undefined when I first stated the principle, can be defined as follows: Let us say that the argument Π is immediately extracted from the set of arguments when either


#### *Defnition 2. Containment*

The argument Π is *contained in* a set of arguments if and only if there is a sequence Σ1, Σ2, . . . , Σn of arguments such that Π = Σn and, for each Σi (i ≤ n), Σi is immediately extracted from the union of the given set with {Σj | j < i}.29

We note the following corollary:

#### *Corollary 5.*

An argument immediately extracted from a set of valid arguments is valid. A fortiori, an argument contained in a set of valid arguments is valid.

The corollary simply combines principles and corollaries stated in Section 7: For the extraction used in clause (i) see Corollary 2, for the one used in clause (ii) see in addition Principle 2, and for the one used in clause (iii) see in addition Corollary 3.

<sup>27</sup> Gentzen (1935, p. 189).

<sup>28</sup> Prawitz (1965); the formulation there refers to natural deductions in Gentzen's system instead of arguments as above, otherwise its content is the same.

<sup>29</sup> Due to Peter Schroeder-Heister and me; see Prawitz (2019a).

All instances of Gentzen's elimination rules satisfy the inversion principle when containment is defined as above. We can strengthen the principle in two ways, namely by requiring in clause (i) of the definition of containment that Π is an argument of the set or is an *immediate* subargument of an argument of the set, and by requiring that the argument for the conclusion is not only contained in the set of arguments for the premisses but can in fact be obtained from the set by one immediate extraction. Let us call the result the *strong inversion principle*. It too holds for all instances of Gentzen's elimination rules:

#### *Fact about E-rules*

Gentzen's elimination rules satisfy the strong inversion principle.

To illustrate by an example, consider the case where an ∃-introduction is immediately followed by an ∃-elimination. It has the form

$$
\frac{\dfrac{\begin{array}{c}\Pi\\ A(t)\end{array}}{\exists \mathbf{x}\, A(\mathbf{x})}
\qquad
\begin{array}{c}[A(\mathbf{x})]^{(1)}\\ \Sigma(\mathbf{x})\\ B\end{array}}
{B}\;(1)
$$

Another argument for B from no more assumptions is obtained by two operations: first substitute t for x in Σ(x), and then form the composition of Π and Σ(t), where Π replaces all free assumptions A(t) in Σ(t) (the corresponding assumptions A(x) in Σ(x) were bound by the considered application of the instance of ∃-elimination). The result of carrying out these two operations is an immediate extraction, in the strengthened sense, from the set of the immediate subarguments of the argument displayed above.
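A rough computational analogue may make the two operations vivid. In a propositions-as-types style reading (my own illustration, not Prawitz's formalism), a canonical argument for an existential is a pair of a witness and an argument, and the ∃-elimination applies a function playing the role of Σ(x); when introduction meets elimination, the detour disappears by exactly the substitution and composition described above. All names here are hypothetical:

```python
def exists_intro(t, proof_of_A_t):
    # Canonical argument for the existential: a witness paired with
    # an argument for the instantiated sentence A(t).
    return (t, proof_of_A_t)

def exists_elim(package, sigma):
    # sigma plays the role of Σ(x): from a term t and an argument for
    # A(t) it produces an argument for B. Applying it to the canonical
    # package performs the substitution and the composition at once.
    t, proof = package
    return sigma(t, proof)

# Hypothetical instance: A(x) = "x is even", B = a report naming a witness.
package = exists_intro(4, 4 % 2 == 0)
print(exists_elim(package, lambda t, p: f"{t} is a witness" if p else "no"))
```

The detour-free argument uses only material already present in the immediate subarguments, mirroring the extraction in the strengthened sense.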

The elimination rule for ⊥ is a somewhat special case. Any instance of that rule satisfies the (strong) inversion principle vacuously, since according to the meaning of ⊥ there is no canonical argument for it.

Invoking the heuristic idea of Section 7 and applying Corollary 5 and the above fact, we can now state

#### *Principle 4 (initial part). Validity of elimination rules*

All instances of Gentzen's elimination rules for intuitionistic logic are valid.

Is this principle really implied by the heuristic idea on the validity of inference? Consider an arbitrary instance of an E-rule. Call this generic inference G. To show that G is valid in accordance with the heuristic idea, we have to show that in virtue of the meanings of the involved sentences it appears directly that, for any application of G to a set of valid arguments resulting in a closed argument Σ, there is a proof of the final conclusion of that argument.

To this end we may say the following. Let C be the final conclusion of Σ, let M be the premiss of the last inference of Σ that corresponds to the major premiss of G (M is thus the result of carrying out on the major premiss of G the substitution (if any) that yields the application in question), and let Π be the subargument of Σ determined by M. Since the inference does not bind anything in the argument for the major premiss, Π is a closed argument, and is thus a proof since it is assumed to be valid. In virtue of the meaning of M, there is a proof Π<sup>∗</sup> of M in canonical form. Replacing Π by Π<sup>∗</sup> in Σ we have an argument Σ<sup>∗</sup> for C on which the strengthened Inversion Principle has a bearing. Hence, by a suitable extraction in the strengthened sense from the set of arguments resulting from the original set by replacing Π by Π<sup>∗</sup>, another argument for C appears. By Corollary 5, this argument for C is valid and hence it is a proof of C.

There are three steps in this little piece of reasoning. As already said, the first step, which gives the canonical argument Π<sup>∗</sup> for the major premiss, is immediate in virtue of the meaning of that premiss. In the third and last step we use Corollary 5, an immediate consequence of Principles 1 and 2, which only make explicit basic intuitions about inferences and arguments. What can be questioned is whether the appearance in the intermediate step of the argument for the final conclusion by an immediate extraction is sufficiently direct. Admittedly a person may know the meanings of the involved sentences and have the intuitions made explicit in Principles 1 and 2 without realizing that there is this operation of extraction yielding an argument for the conclusion.30 Nevertheless only knowledge of the meaning of the major premiss of the inference and a few reflexions are needed to recognize that there is a canonical proof Π<sup>∗</sup> of the major premiss and a proof of the final conclusion of Σ that uses no other inferences than those already occurring in Π<sup>∗</sup> or in the arguments for the other premisses (if any).

The elimination rule for ⊥ is again a somewhat special case. In virtue of the meaning of ⊥, there is no application of an instance of that rule to a valid argument for ⊥ resulting in a closed argument, since there is no argument for ⊥ in canonical form and hence no closed valid argument for ⊥. What has to be shown for all such applications according to the heuristic idea is thus vacuously satisfied directly.

To generalize the Inversion Principle so that it concerns inferences in general, not only elimination rules, and holds more generally for languages given with other introduction rules than Gentzen's, we have to take into account that the inferences whose validity we want to establish may not have one major premiss that can be referred to in the way we did in the statement of the Inversion Principle. Instead of referring to one premiss as the major one, some of the premisses that play a role similar to that of the major premiss will be distinguished and will be identified by their ordinal numbers; we shall thus be speaking of the i:th premiss of an inference.31

A particular feature of a generic inference G that satisfies the Inversion Principle is that the following holds for any application of G to arguments among which the one for the major premiss is in canonical form: another argument for the final conclusion of the result of the application can be obtained whose inferences occur already in the arguments to which G is applied or are substitution instances of such inferences. Instead of applying G, one can therefore argue for the conclusion by applying those

<sup>30</sup> This is a point stressed by Cozzo (2021), who draws the conclusion that the validity of an elimination inference is synthetic and non-meaning-involved.

<sup>31</sup> This means that I am using the ordering of the premisses of an inference, which otherwise is without significance with respect to their identity (see the parenthetical remark in Section 3). In another generalization of the inversion principle proposed by Schroeder-Heister (1983) the idea of distinguishing some of the premisses of an inference is crucial in a similar way; instead of referring to them by ordinals he marks them by asterisks.

inferences, which have already been used essentially and have thus been accepted implicitly as valid. What can be argued for by such applications of G can thus be argued for by using already available inferences. It seems therefore fitting to call inferences that share this feature with those that satisfy the Inversion Principle *non-creative*.

To begin with I shall restrict this notion to inferences whose applications to a set of arguments are seen to enjoy this feature of non-creativity by extractions from that set, in the same way as for inferences satisfying the Inversion Principle. It is defined as follows:

#### *Definition 3. Non-creative inferences*

A generic inference G is *non-creative* when it has a number of distinguished premisses at which no binding occurs, and for any closed argument Π resulting from an application of G to a sequence of arguments such that the ones with the same ordinal numbers as the distinguished premisses are in canonical form, an argument for the final conclusion of Π is contained in that sequence.

With the major premiss as the distinguished one, Gentzen's elimination rules are thus non-creative. So is the generic inference represented by the inference figure

$$\frac{A \qquad A \to B \qquad B \to C}{C}$$

with A → B and B → C as the distinguished premisses. Instances of the rule of explicit substitution considered at the beginning of Section 8 are examples of non-creative inferences with an empty set of distinguished premisses.
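The non-creativity of this figure can be illustrated in a propositions-as-types reading (my own illustration, not part of the text's apparatus): canonical arguments for the two distinguished implication premisses are functions, and the extracted argument for C merely composes material already at hand, creating nothing new:

```python
# Sketch: given an argument a for A and canonical (function-like) arguments
# f for A -> B and g for B -> C, the argument for C is assembled purely
# from the arguments to which the inference is applied.

def argument_for_C(a, f, g):
    # f and g are the distinguished premisses, assumed in canonical form.
    return g(f(a))

# Hypothetical instantiations: A = int, B = str, C = bool.
a = 7
f = lambda n: str(n)        # an argument for A -> B
g = lambda s: s.isdigit()   # an argument for B -> C
print(argument_for_C(a, f, g))  # True
```

The composite `g(f(a))` uses only inferences (here: applications) already occurring in the given arguments, which is the feature the text calls non-creativity.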

We can now state Principle 4 in a more general form:

*Principle 4 (completed). Validity of non-creative inferences* All non-creative inferences are valid.

The reasoning showing that the principle is implied by the heuristic idea is essentially the same as that for the first part of the principle stated above, except that we are stretching the requirement of immediate appearance, since the argument for the final conclusion of the result of applying the inference may now appear after a series of extractions instead of just one.

It is to be noted that a generic inference G may be vacuously non-creative because there are one or more distinguished premisses for which there is no application of G to a sequence of arguments such that the ones with the same ordinal numbers as the distinguished ones are in canonical form. But this can only happen in case no argument in canonical form has been specified in the explanation of the meaning of the distinguished premisses, as is the case for ⊥. In such a case G is also valid according to the heuristic idea, since in virtue of the meaning of the involved sentences the condition stated by the heuristic idea is vacuously satisfied.

One may contemplate extending the term non-creative to inferences that enjoy the feature of non-creativity without this necessarily being seen by making extractions; in other words, conclusions of applications of these inferences to certain sets of arguments again have arguments using only inferences occurring in those sets or substitution instances of such inferences, but the inferences can be freely combined and do not need to occur in combinations obtained by extractions from the sets.

#### **10 Concluding remarks**

On the basis of an initial discussion of the nature and aim of inferences and inspired by Gentzen's idea about how the meanings of logical constants are determined, we have arrived at some precise principles about the validity of inferences. According to them all instances of the inference rules of natural deduction for intuitionistic logic are valid. The result can be seen as a proposal for how Gentzen's idea is to be understood in order to yield such a result.

A question that naturally arises is whether all non-creative inferences expressible in the language of intuitionistic predicate logic, LIPL, are derivable in that logic, IPL. Whether an inference is non-creative may depend on the base (Section 8); as will be exemplified below, induction inferences are non-creative given a certain condition on the base, although they are of course not derivable in IPL. We should therefore relativize non-creativity to a base and ask whether inferences expressible in LIPL that are non-creative relative to all bases are derivable in IPL. This is a precise logical question about a kind of completeness of intuitionistic predicate logic that it may be possible to answer.

A more philosophical and vaguer question is whether the heuristic idea about the validity of inferences implies the validity of inferences in the language of intuitionistic predicate logic over and above what follows from Principles 3 and 4. Here the validity should also be relativized to a base. The question is thus whether inferences expressible in LIPL that are valid relative to all bases according to the heuristic idea are so in virtue of Principles 3 and 4. A positive answer to this question would be a reason for identifying the logically valid inferences in LIPL with the inferences that are either non-creative or are instances of introduction rules.

Going outside of logic, it is of interest to consider the rule of mathematical induction. If the intended individual domain is the set of natural numbers, it may be given the form

$$
\frac{A(0) \qquad \begin{array}{c}[A(x)]\\ \vdots\\ A(s(x))\end{array}}{A(t)}
$$

where t is a schematic letter for individual terms and s is the successor operation (since t may be replaced by a free variable in instances of the rule, the rule has the whole strength of mathematical induction, allowing us to infer ∀x A(x)). Provided that the closed individual terms of the language in question consist of numerals, all instances of this inference rule are easily seen to be non-creative, and they are hence valid according to Principle 4. However, they cease to be so when there are closed individual terms other than numerals.
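The sense in which induction over numerals is non-creative can be illustrated computationally: for a closed numeral, an argument for the conclusion is extracted by unwinding the induction, iterating the step argument on the base argument. A sketch under that assumption (names are mine, not the author's formalism):

```python
# Sketch: when every closed term is a numeral s(s(...s(0)...)), a witness
# for A(t) is obtained by applying the induction-step argument once per
# occurrence of the successor s, starting from the base argument for A(0).

def unwind_induction(base, step, n):
    """Build a witness for A(n) from a witness `base` for A(0) and a
    `step` transforming a witness for A(k) into one for A(s(k))."""
    witness = base
    for k in range(n):
        witness = step(k, witness)
    return witness

# Hypothetical example: A(n) = "a list of length n"; base = [], step appends.
print(unwind_induction([], lambda k, w: w + [k], 3))  # [0, 1, 2]
```

No inference beyond those contained in the base and step arguments is used, which is why such instances count as non-creative relative to a base whose closed terms are all numerals.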

If the intended individual domain contains other elements than natural numbers, the rule of induction has to be qualified by adding N(t) (N for the predicate of being a natural number) as a third premiss. In this form the rule has instances that are not non-creative. Since they should of course be counted as valid, going outside of logic

one needs to find an extension of Principle 4 if one is to cover by precise principles inferences that we consider valid.

**Acknowledgements** I am grateful to Cesare Cozzo, Valentin Gorenko, Per Martin-Löf, Luiz Carlos Pereira, Antonio Piccolomini d'Aragona, and Peter Schroeder-Heister for their comments on an earlier version of this essay, which have helped me to improve several passages. I have also benefited from remarks made in the discussion after my presentation of part of this essay at a Nordic Online Logic Seminar in March 2021.

#### **References**

Boghossian, P. (2014). What is inference? *Philosophical Studies* 169 (1), 1–18.


Tichý, P. (1988). *The Foundations of Frege's Logic*. Berlin: De Gruyter.



## **Kolmogorov and the General Theory of Problems**

Wagner de Campos Sanz

**Abstract** This essay is our modest contribution to a volume in honor of our dear friend and fellow logician Peter Schroeder-Heister. The objective of the article is to reexamine Kolmogorov's problem interpretation for intuitionistic logic and the basics of a general theory of problems. The task is developed by first examining the interpretation and presenting a new elucidation of it through Reduction Semantics. Next, in view of Kolmogorov's intentions concerning his problem interpretation, Reduction Semantics is employed in a brief epistemological analysis of Euclidean Geometry and its construction problems. Finally, on the basis of the previous steps, some theses are raised concerning the intuitionistic logical constants and concerning proofs and hypotheses in Euclid's *Elements*.

#### **1 Introduction**

The epistemology of mathematics has gone through an important change in the last centuries: from a focus on problems to a focus on theorems. It is time for the pendulum to swing back. A landmark was *The Foundations of Geometry* (Hilbert, 1899), which transformed the *Propositiones* of Euclid's *Elements* into theorem assertions, whereas a good number of them were statements of problems. This is the case for the first three *Propositiones* in Book I.

The paper investigates *Kolmogorov's problem interpretation*. The objective is to unfold the basics of a general theory of problems, answering some very simple questions: what problems are, why they are not assertions, and how problems in geometry were structured.

Kolmogorov (1932, p. 151) states his purposes as follows:

Along with the development of theoretical logic, which systematizes the schemes of proofs of theoretical truths, it is also possible to systematize the schemes of solution of problems, for example, geometrical construction problems [. . .] The second section, in which the general intuitionistic presuppositions are accepted, presents a critical analysis of intuitionistic logic. It is shown that this logic should be replaced by the calculus of problems, since the objects under consideration are in fact problems, rather than theoretical propositions.

Wagner de Campos Sanz, Faculdade de Filosofia, Universidade Federal de Goiás, Goiânia, Brazil, e-mail: wsanz@ufg.br

Years later, in a comment about his paper, Kolmogorov evaluated it as follows (Tikhomirov, 1991, p. 452):

*On the interpretation of intuitionistic logic* (IIL) was written with the hope that the logic of solutions of problems would later become a regular part of courses on logic. It was intended to construct a unified logical apparatus dealing with objects of two types — propositions and problems.

Kolmogorov (1932) belongs to a context in which the nature of intuitionistic logic was being discussed. A crisis concerning some basic mathematical concepts had set Hilbert's school, with its foundational program for mathematics, against Brouwer's intuitionistic school. Intuitionists rejected certain means of proof of traditional mathematics, such as the proof of existential sentences over infinite collections by deducing an absurdity from a negated universal hypothesis (*reductio*), as well as the use of the principle of excluded middle. Since the criticisms were directed at logical principles, it became important to elucidate what would be an acceptable intuitionistic logic. Heyting (1930) is now taken as the standard formulation of intuitionistic logic. Kolmogorov (1932) is a later publication in which the issue of how to interpret the logical constants was still being debated. It is considered part of the so-called BHK<sup>1</sup> interpretation of the intuitionistic logical constants.

In the quotes above Kolmogorov remarks that logic has historically been occupied with schemes of theoretical truths. Next, he makes some surprising claims. First, there is the claim that logic is also occupied with schemes of solutions of problems, with geometrical constructions as a relevant example. Second, the claim that intuitionistic logic should be replaced by a calculus of problems. Third, the statement about a unified logical apparatus for dealing with objects of two distinct types: propositions and problems. As declared, he had hopes that the logic of solutions of problems could become a regular part of courses on logic. Kolmogorov's problem conceptualization has usually been explained through the paradigm of propositions-as-types (Coquand, 2007). Here, a different elucidation will be developed.

The investigation of Kolmogorov's problem semantics explores each of these claims. His problem semantics is going to be reformulated as Reduction Semantics. The adequacy of this elucidation is dealt with in two separate steps. First, the formulation of the semantics and the issues of soundness and completeness for intuitionistic logic are examined in Section 2. Second, a partial anatomy of Euclid's Geometry by means of Reduction Semantics, trying to unfold the schemes of solutions of problems in geometrical constructions, is investigated in Section 3. To these two sections is added a third section focusing on two specific subjects: on the one hand, the questions of what a logical constant is and whether the usual set of intuitionistic propositional logical constants should be extended or not; on the other hand, a partial discussion concerning the role of hypotheses inside geometry.

<sup>1</sup> Kolmogorov's interpretation is the third element in what has been conventionally called the BHK interpretation of the intuitionistic logical constants. The abbreviation corresponds to the initials of Brouwer, Heyting and Kolmogorov.

Reduction Semantics is the further development of an initial investigation of the problem interpretation in de Campos Sanz (2012). Reduction Semantics turns out to be similar to Hypo Semantics (de Campos Sanz, 2019) in a new guise, since the basic objects are now problems.

#### **2 A semantics of problems**

#### **2.1 Kolmogorov on problems**

Kolmogorov's article is schematic and does not contain an explanation of what a problem is, only a few carefully chosen examples (Kolmogorov, 1932, pp. 151–152):


According to Coquand (2007, § 2.5), the perspective of interpreting intuitionistic logic as a calculus of problems is an important antecedent of a later distinction: that between formulae-as-types and λ-terms, which would then correspond to a separation between problems and solutions, respectively. Although this can be a productive way of interpreting Kolmogorov's semantics of problems, it is not clear that this author would subscribe to such a sharp separation.

Kolmogorov (1932, p. 151) characterized his semantics of problems for logical molecular operators as follows:

If A and B are problems, then A ∧ B denotes the problem "solve both problems A and B", while A ∨ B stands for "solve at least one of the problems A and B". Further, A ⊃ B is the problem [(FIRST)] "given a solution to problem A, solve problem B" or, which is the same, [(SECOND)] "reduce the solution of problem B to the solution of problem A". [. . .] ¬A denotes the problem "assuming that there is a solution to problem A, derive a contradiction".2

In the five problems quoted above, the main verb is imperative. They are to be considered as commands of an action. The semantic explanations, in their turn, notably contain the command "solve". Implication is the only exception. It has two explanations, (FIRST) and (SECOND), and the author regards them as equivalent. The expression "solution" is used in both explanations for naming, while the expression "reduce" in the (SECOND) explanation is used for commanding.

<sup>2</sup> Square brackets were added by us.

Some have assumed that the problem interpretation required two different representational structures, one for problems and one for solutions. But there are reasons for adopting another point of view, given that the inner nature of problems and solutions seems to be the same: they are about actions.

Consider Kolmogorov's description of what a conjunction is. Given any two problems A and B, A ∧ B is the problem "solve both problems A and B". The action "solve" being commanded is iterated in order to form the description of a problem. When there is plain knowledge of how to proceed in order to bring about an action being commanded, then solving the problem corresponds to doing the required action. Hence, in case A and B are problems that one knows how to solve immediately, and the problem A ∧ B represents the problem of solving A and solving B, then to solve A ∧ B consists in doing what solves A and doing what solves B.

An example might help to clarify the point. Consider the following simple problem: construct a circle *BCD* with center A and radius AB, and construct a circle *ACE* with center B and radius BA. This is a conjunctive problem. Employing Kolmogorov's description, it denotes the problem of solving both problems, that is, solve the problem of constructing *BCD* and solve the problem of constructing *ACE* as required. Both problems are immediately solvable. The solution consists in doing the construction of the circle *BCD* and doing the construction of the circle *ACE*.

Concerning implication now, it is obscure how the semantics could be formulated by using the command "solve" in the antecedent. By saying "first solve A, then solve B", another logical connective, distinct from the usual implication, seems to be employed, since "solve" commands an action and the expression "first solve . . ., then solve . . ." indicates a temporal ordering of the actions. Actually, in order to establish an intuitionistic implication, if we start by solving B, then A ⊃ B can be considered solved; but this violates the meaning explanation using the word "solve", which is truly a before-after conjunction. We will come back to this question later.

In the reverse direction, an attempt to employ the expression "solution" for explaining conjunction and disjunction would require the use of another command verb in place of "solve", as in "produce the solution of . . ." or just "do . . .". This seems to be no real progress. Thus, a strange heterogeneity appears in the above semantical explanations. On one side we have one kind of explanation for disjunction and conjunction, employing the verb "solve"; on the other side, another kind for implication, employing the concept "solution" and/or the verb "reduce".

Actually, the (SECOND) explanation of implication above is partially misleading, since the expression "reduce *the* solution of . . ." seems to make reference to a specific solution. But a problem could possess distinct solutions. Observe that the expression "reduce a solution of . . ." is also unclear, since it could be understood as meaning "for some solution of B, reduce it . . .". The statement would be better rendered if it were formulated as "reduce the *solvability* of . . .", which in turn might bring it closer to the (FIRST) explanation. Similar issues concerning the article "the" also affect the complement part of the (SECOND) explanation. It is important to stress that the usual intuitionistic interpretation of the logical constants requires, for any construction of A, a corresponding construction of B, in order to have a true implication.

The (FIRST) explanation of implication can be seen as a characterization where solvability/demonstrability is transmitted, similar to what happens in the case of admissible rules. Under this interpretation the expression "suppose" is to be assumed as a hidden preamble, as in: "[suppose] given a solution to problem . . .".

The simplest and best way of rendering the (SECOND) explanation, in our opinion, would be to explain it as "reduce problem B to problem A", characterizing a reducibility junction between problems. But this is acceptable only under the proviso that a solution for B must be obtainable whenever a solution for A is provided. This seems the best explanation of implication from the point of view of problem semantics. It does not suffer from the issues pointed out in (SECOND) and it can also be extended to other constants, as we are going to argue below. But it is distinct from (FIRST).

In order to exemplify the reducibility junction, let us consider again the above example concerning circles. The new problem of finding a point C that is as distant from the points A and B as A and B are from each other is a problem that is reducible to the problem of constructing circle *BCD* and constructing circle *ACE* as described above. That is, constructing both circles as described is a way of obtaining a solution for the problem of finding the required point C.
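Under stated assumptions this reduction can even be executed numerically. A sketch (my own illustration, with hypothetical coordinates A = (0, 0), B = (1, 0)): solving the two circle-construction problems and reading off an intersection point solves the reduced problem, as in Euclid I.1:

```python
import math

def circle(center, radius):
    # A "solution" to a circle-construction problem: its defining data.
    return (center, radius)

def intersection_point(c1, c2):
    # One intersection point of two circles of equal radius r whose
    # centers are r apart (the configuration of the example above).
    (x1, y1), r = c1
    (x2, y2), _ = c2
    mx, my = (x1 + x2) / 2, (y1 + y2) / 2      # midpoint of AB
    d = math.hypot(x2 - x1, y2 - y1)           # distance between centers
    h = math.sqrt(r * r - (d / 2) ** 2)        # height above the midpoint
    ux, uy = (x2 - x1) / d, (y2 - y1) / d      # unit vector from A to B
    return (mx - h * uy, my + h * ux)          # rotated by 90 degrees

A, B = (0.0, 0.0), (1.0, 0.0)
BCD = circle(A, 1.0)   # circle BCD: center A, radius AB
ACE = circle(B, 1.0)   # circle ACE: center B, radius BA
C = intersection_point(BCD, ACE)
# C ≈ (0.5, 0.866): equidistant from A and B at distance AB.
```

The point C forms an equilateral triangle with A and B, which is exactly what solving the conjunctive construction problem delivers.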

Concerning the notions of negation and contradiction, we find in Kolmogorov's paper the following passages. The first is the text of a footnote accompanying the word "contradiction" in the last quotation above (*ibid.*, p. 151):

[. . .] ¬A should not be read as the problem "prove the unsolvability of problem A". In the general case, if the "unsolvability of problem A" is considered as a completely defined notion, we only obtain that ¬A implies the unsolvability of A, and not the converse assertion. If, for example, it were proved that a realization of the well-ordering of the continuum is beyond our possibilities, it would not be possible to assert that the existence of such a well-ordering implies a contradiction.

And on p. 156:

It should also be mentioned that if ⊢ A is false in classical propositional logic, then the corresponding problem ⊢ A cannot be solved. Indeed, in view of the earlier accepted formulas and rules of the calculus of problems, this formula ⊢ A readily implies the contradictory formula ⊢ A ∧ ¬A.

And on p. 157:

[. . .] Brouwer suggests a new definition of negation, namely "A is false" should be understood as "A leads to a contradiction". Thus, the negation of a proposition A is transformed into the *existential sentence* "there exists a chain of logical inferences leading to a contradiction if A is assumed to be true".

Kolmogorov does not give further explanations of what he takes negation and contradiction to mean in the context of problems. So, what we have is the conflation of the concept of contradiction with the usual contradictory formula. But this formula already uses a negation, as can be observed in the second of the three quotations above. We believe that to a good measure the concept was assumed as unproblematic by the author. In a similar spirit, when explaining the intuitionistic logical constants, Heyting (1956, p. 102) explicitly says that he takes contradiction to be a primitive notion and adds that it is easy to recognize a contradiction when we are confronted with one.

#### **2.2 Reduction semantics**

We claim that Kolmogorov's explanations can be made homogeneous by reorganizing the background concepts, taking the concept of reduction as the basic relation in the semantical explanation of the problem constants.

The expression Γ ⊩ A will from now on mean "problem A has been reduced to the multiset of problems Γ", but only if it is guaranteed that a solution of A is obtained in case solutions whatever are given for each problem in Γ. This proviso concerning how to define reduction is stated in Veloso (1984, p. 26) as a requirement for the acceptability of problem transformations.3

Based on the reduction relation, every logical constant that Kolmogorov was considering can be defined by semantical clauses, where the logical constant "⊥" represents a basic problem: the *impossible* problem.4 Negation is explicitly defined as ¬A ≡ A ⊃ ⊥. The clauses are:

Clauses for using a logical constant in the repertoire:

(∧r): Γ, A ∧ B ⊩ C ⟺ Γ, A, B ⊩ C;
(∨r): Γ, A ∨ B ⊩ C ⟺ Γ, A ⊩ C and Γ, B ⊩ C;
(⊃r): Γ, A ⊃ B ⊩ C ⟺ given any problem D: (D, A ⊩ B ⇒ Γ, D ⊩ C);
(⊥r): Γ, ⊥ ⊩ P always (where P is a basic problem).

Clauses for using a logical constant in the focus:

(∧f): Γ ⊩ A ∧ B ⟺ Γ ⊩ A and Γ ⊩ B;
(∨f): Γ ⊩ A ∨ B ⟺ given any problem C: ((Γ, A ⊩ C and Γ, B ⊩ C) ⇒ Γ ⊩ C);
(⊃f): Γ ⊩ A ⊃ B ⟺ Γ, A ⊩ B;
(⊥f): Γ ⊩ ⊥ ⟺ given any basic problem P: Γ ⊩ P.

The *repertoire* of problems is on the left side of the semantical symbol, and the *focal* problem is on the right side. Clauses are of two kinds: (i) those explaining the use of logical constants in repertoires, that is, finite multisets5; (ii) those explaining the use of logical constants in the focal problem. The symbol "⟺"

<sup>3</sup> He suggests that we should not consider arbitrary links between problems, but only those guaranteeing that solvability of the reduced problem is obtained from solvability of the problems to which it is reduced.

<sup>4</sup> Kolmogorov did not use it in his paper; here we follow a later tradition in intuitionistic logic.

<sup>5</sup> Finite sets admitting multiple copies of an element.

expresses a necessary and sufficient condition6, that is, an explanation of meaning which has the *explicandum* on the left side and the *explicans* on the right side.7

As an exemplification, the readings of the two clauses for implication are presented next. Consider clause (⊃f). As a semantical rule, it is read as follows: in order to reduce the reducibility problem a ⊃ b to the multiset of problems Γ, it is necessary and sufficient to show that b reduces to the multiset Γ ∪ {a}. The reading of clause (⊃r) is a little more complex. As a semantical rule, it has to be read as follows: in order to reduce a problem c to the multiset of problems Γ ∪ {a ⊃ b}, it is necessary and sufficient to show that, for any given problem d, on the supposition that b reduces to the multiset of problems {d, a}, it follows that c reduces to the multiset of problems Γ ∪ {d}.

Some expressions with logical content are unavoidable in the metalanguage, and they must be interpreted in their constructive sense. The other clauses are read in a similar way to the two just considered, with the *explicans* quantified or not.

The clauses can be seen in action in the following proof.

#### **Theorem 2.1** ⊩ ¬(a ∧ ¬a).

*Proof* Suppose that d, a ⊩ ⊥. Thus, d, a ⊩ ⊥ ⇒ a, d ⊩ ⊥, since {d, a} and {a, d} are the same multiset. Hence, for any given problem d, (d, a ⊩ ⊥ ⇒ a, d ⊩ ⊥). Now, by clause (⊃r), a, ¬a ⊩ ⊥ according to the definition of negation. By clause (∧r), a ∧ ¬a ⊩ ⊥. By the definition of negation and by clause (⊃f) we finally have ⊩ ¬(a ∧ ¬a). □
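Claims of the form ⊩ a can also be checked mechanically. The clauses themselves quantify over all problems and so are not directly executable; the sketch below instead decides derivability in intuitionistic propositional logic, which matches the calculus of problems by the soundness and completeness results discussed later. The back-end is Dyckhoff's terminating sequent calculus G4ip, our choice and not part of the text.

```python
BOT = 'bot'                      # the impossible problem
def Neg(a):                      # negation as defined in the text: neg a = a > bot
    return ('imp', a, BOT)

def prove(gamma, goal):
    """Decide whether goal is intuitionistically derivable from the multiset gamma."""
    gamma = list(gamma)
    if BOT in gamma or goal in gamma:
        return True
    # invertible left rules: decompose one repertoire formula and recurse
    for i, f in enumerate(gamma):
        if not isinstance(f, tuple):
            continue
        rest = gamma[:i] + gamma[i + 1:]
        op, a, b = f
        if op == 'and':
            return prove(rest + [a, b], goal)
        if op == 'or':
            return prove(rest + [a], goal) and prove(rest + [b], goal)
        if op == 'imp':
            if a == BOT:
                return prove(rest, goal)             # bot > b is trivially available
            if isinstance(a, str):
                if a in rest:                        # atomic antecedent is present
                    return prove(rest + [b], goal)
            elif a[0] == 'and':                      # (c & d) > b  ~~>  c > (d > b)
                return prove(rest + [('imp', a[1], ('imp', a[2], b))], goal)
            elif a[0] == 'or':                       # (c | d) > b  ~~>  c > b, d > b
                return prove(rest + [('imp', a[1], b), ('imp', a[2], b)], goal)
    # invertible right rules
    if isinstance(goal, tuple) and goal[0] == 'and':
        return prove(gamma, goal[1]) and prove(gamma, goal[2])
    if isinstance(goal, tuple) and goal[0] == 'imp':
        return prove(gamma + [goal[1]], goal[2])
    # non-invertible choices: right disjunction, nested implication on the left
    if isinstance(goal, tuple) and goal[0] == 'or':
        if prove(gamma, goal[1]) or prove(gamma, goal[2]):
            return True
    for i, f in enumerate(gamma):
        if isinstance(f, tuple) and f[0] == 'imp' \
                and isinstance(f[1], tuple) and f[1][0] == 'imp':
            rest = gamma[:i] + gamma[i + 1:]
            (_, (_, c, d), b) = f                    # f = (c > d) > b
            if prove(rest + [c, ('imp', d, b)], d) and prove(rest + [b], goal):
                return True
    return False

# Theorem 2.1 confirmed, and tertium non datur rejected:
assert prove([], Neg(('and', 'a', Neg('a'))))
assert not prove([], ('or', 'a', Neg('a')))
```

The same `prove` function validates, for instance, axiom 2.12 of the next section.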

Given any problem, we can fairly say that either it has been solved or it has not. If it was solved, then either it has been *positively* solved, when a correct solution is given to the problem, or it has been *negatively* solved, when a positive solution has been shown to be impossible. The expression "to be solvable" is ambiguous: it can be employed in a strict sense or in a large sense. The distinction is relevant for the discussion coming next.

Although the command "solve" could mean, in the large sense, either positively solve or negatively solve8, the intended reading of the expression in the case of Kolmogorov is probably the strict sense, that of positively solving. Since the notion of reduction is at hand, the command "solve" can be dismissed. That is, to categorically and positively solve ¬(a ∧ ¬a) is the same as to show that it has been reduced to the empty repertoire of problems. There is an infinite set of problems that can also be reduced to the empty repertoire.9 To negatively solve a problem means to show that the impossible problem has been reduced to it, as in a ∧ ¬a ⊩ ⊥. An implication a ⊃ b is positively solved categorically when b has been reduced to a according to clause (⊃f). The expression "has been reduced" means that an action took place.

<sup>6</sup> Harmony is an inherent property of explanations stated in terms of necessary and sufficient conditions.

<sup>7</sup> Thus, it is not an explicit definition, since the constants being explained also occur in the metalanguage used for stating the clauses.

<sup>8</sup> Which then means that one of two possible conclusions is expected.

<sup>9</sup> This must be taken into account when reading the clauses containing a quantification, i.e., (⊃r) and (∨f).

A true simplification seems to be achieved in Reduction Semantics, since both verbs "solve" and "do" mentioned before can be substituted by the verb "reduce".

The fictitious impossible problem (⊥) is considered as negatively solved. Thus, since negation is defined via implication, ¬a is the problem of reducing the impossible to the problem a. Hence, at the same time that a is negatively solved, ¬a is also positively solved, by (⊃f). The fictitious impossible problem is assumed to be a basic problem itself, and it is semantically characterized as a problem to which all basic problems reduce. That is, its solution is a *panacea*. Hence, the concept of contradiction originally used in Kolmogorov's paper can also be dismissed. As such, when the author says that a conditional problem is meaningless and then considers it as solved, this is equivalent to saying that the antecedent problem is impossible and the whole conditional problem is reduced to the empty repertoire, since every basic problem reduces to the impossible problem by definition.

The above clauses must be complemented by the following *structural* semantical principles, which make explicit the properties of the semantical relation of reduction:

(basic problem Identity): for basic q: q ⊩ q;
(Load problem): Γ ⊩ a ⇒ Γ, b ⊩ a;
(Drop basic problem): for basic q: (Γ, q ⊩ a & Γ ⊩ q) ⇒ Γ ⊩ a.
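The structural principles can be sanity-checked in a degenerate model: read "Γ ⊩ a" as inf(Γ) ≤ a in the three-element chain 0 < 1 < 2. This toy order-theoretic model is only our illustration (Reduction Semantics itself is not reducible to it), but all three principles hold in it, as the following sketch verifies exhaustively.

```python
from itertools import product

def reduces(repertoire, focus):
    """Toy reading of 'repertoire ||- focus': inf of the repertoire is <= focus."""
    return min(repertoire, default=2) <= focus   # empty repertoire has inf = top

vals = range(3)
# (basic problem Identity): q ||- q
assert all(reduces([q], q) for q in vals)
# (Load problem): Gamma ||- a  implies  Gamma, b ||- a
assert all(reduces([g, b], a)
           for g, a, b in product(vals, repeat=3) if reduces([g], a))
# (Drop): (Gamma, q ||- a  and  Gamma ||- q)  implies  Gamma ||- a
assert all(reduces([g], a)
           for g, q, a in product(vals, repeat=3)
           if reduces([g, q], a) and reduces([g], q))
```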

The above semantics of problems, that is, the structural principles plus the clauses, is called *Reduction Semantics*. When the variables a, b, etc. are interpreted as sentences, we call it *Hypo[thesis] Semantics* (de Campos Sanz, 2019).

**Theorem 2.2** *(i) The full (Identity) principle holds, i.e., for any* a: a ⊩ a*. (ii) The full (Drop) principle holds, i.e., for any* a: (Γ, a ⊩ b & Γ ⊩ a) ⇒ Γ ⊩ b*.*

*Proof* (i) and (ii) are proved by induction on the logical degree of the problem a, using both the clauses for the repertoire and the clauses for the focus.10 □

Semantical variations or extensions of the above set of problem constants given by the clauses are envisageable. In some cases it might well occur that, for a new constant, there are restrictions on the application of the full (Drop) principle. We think this might well be the case for some modal operators, and this is a reason for preferring the weaker rather than the full formulation of the structural principles.

Going back to Kolmogorov's paper in order to determine whether the above picture fits, notice that the author intends with his calculus ". . . to systematize the schemes of solution of problems", such as those in geometrical constructions. Next, two issues are going to be considered. The first is the adequacy of Kolmogorov's calculus of problems with respect to Reduction Semantics. The second is the adequacy of Kolmogorov's notion of problem for describing constructions and proofs in ancient geometry, a case study already envisaged by him.

<sup>10</sup> In case the full principles were assumed from the beginning, it would be enough to have one clause for each logical constant, since the other clause could be obtained by the use of the full principles. But, of course, in that case we would miss the fact that the weaker structural principles are enough to formulate the semantics.

#### **2.3 Adequacy of Reduction Semantics with respect to Kolmogorov's interpretation**

The second part of Kolmogorov's paper presents a calculus of problems and describes the validation of its axioms and theorems. Here are the axioms:

$$\begin{aligned} &\text{Group A} \\ &2.1 \vdash a \supset a \land a; \\ &2.11 \vdash a \land b \supset b \land a; \\ &2.12 \vdash (a \supset b) \supset (a \land c \supset b \land c); \\ &2.13 \vdash (a \supset b) \land (b \supset c) \supset (a \supset c); \\ &2.14 \vdash b \supset (a \supset b); \\ &2.15 \vdash a \land (a \supset b) \supset b; \\ &3.1 \vdash a \supset (a \lor b); \\ &3.11 \vdash a \lor b \supset b \lor a; \\ &3.12 \vdash (a \supset c) \land (b \supset c) \supset (a \lor b \supset c); \\ &4.1 \vdash \neg a \supset (a \supset b); \\ &4.11 \vdash (a \supset b) \land (a \supset \neg b) \supset \neg a. \end{aligned}$$
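A quick finite sanity check on this list is to evaluate it in a Heyting algebra, since every intuitionistic theorem takes the top value in any such algebra. The sketch below uses the three-element Gödel chain, our illustrative choice rather than anything Kolmogorov employs, and also shows that a ∨ ¬a can fall short of the top value there.

```python
from itertools import product

# Three-element Goedel chain 0 < 1 < 2 (2 = top, "solved").
TOP = 2
def conj(x, y): return min(x, y)
def disj(x, y): return max(x, y)
def imp(x, y):  return TOP if x <= y else y
def neg(x):     return imp(x, 0)

# Group A axioms as value functions of the problems a, b, c.
axioms = [
    lambda a, b, c: imp(a, conj(a, a)),                                  # 2.1
    lambda a, b, c: imp(conj(a, b), conj(b, a)),                         # 2.11
    lambda a, b, c: imp(imp(a, b), imp(conj(a, c), conj(b, c))),         # 2.12
    lambda a, b, c: imp(conj(imp(a, b), imp(b, c)), imp(a, c)),          # 2.13
    lambda a, b, c: imp(b, imp(a, b)),                                   # 2.14
    lambda a, b, c: imp(conj(a, imp(a, b)), b),                          # 2.15
    lambda a, b, c: imp(a, disj(a, b)),                                  # 3.1
    lambda a, b, c: imp(disj(a, b), disj(b, a)),                         # 3.11
    lambda a, b, c: imp(conj(imp(a, c), imp(b, c)), imp(disj(a, b), c)), # 3.12
    lambda a, b, c: imp(neg(a), imp(a, b)),                              # 4.1
    lambda a, b, c: imp(conj(imp(a, b), imp(a, neg(b))), neg(a)),        # 4.11
]

# Every axiom takes the top value under every assignment ...
assert all(ax(a, b, c) == TOP
           for ax in axioms for a, b, c in product(range(3), repeat=3))
# ... while tertium non datur does not: for a = 1, a OR (NOT a) = 1 < 2.
assert disj(1, neg(1)) != TOP
```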

Kolmogorov gives as an example an argument containing a general method for solving problem 2.12, valid for any a, b, c. That is, he is basically giving an intuitive semantical justification, based on the problem interpretation, of the validity of axiom 2.12 above:

For example, in problem 2.12, assuming that the solution of b has already been reduced to the solution of a, one should reduce the solution of b ∧ c to that of a ∧ c. Let a solution of a ∧ c be given. This means that we are given both a solution of a and a solution of c. By the hypothesis, we can derive a solution of b from that of a, and, since a solution of c is known, we obtain solutions of both problems b and c, and hence a solution of problem b ∧ c.

The expression "the solution of", we recall, has been dismissed as inappropriate. The above argumentation can be reformulated as a validity proof of the above axioms with respect to Reduction Semantics.

#### **Lemma 2.3** *All axioms in the calculus of problems are valid with respect to Reduction Semantics.*

*Proof* The axiom 2.12 with all parentheses becomes: ⊢ (a ⊃ b) ⊃ ((a ∧ c) ⊃ (b ∧ c)). It is validated as follows. (1) a ∧ c ⊩ a ∧ c by (Id); (2) a ⊃ b, a ∧ c ⊩ a ∧ c from (1) by (Load); (3) a ⊃ b, a ∧ c ⊩ a from (2) by the necessary condition of (∧f); (4) a ⊃ b, a ∧ c ⊩ c from (2) by the necessary condition of (∧f); (5) a ⊃ b ⊩ a ⊃ b by (Id); (6) a, a ⊃ b ⊩ b from (5) by the necessary condition of (⊃f); (7) a, a ⊃ b, a ∧ c ⊩ b by (Load) from (6); (8) a ⊃ b, a ∧ c ⊩ b from (3) and (7) by (Drop); (9) a ⊃ b, a ∧ c ⊩ b ∧ c from (4) and (8) by the sufficient condition of (∧f); (10) a ⊃ b ⊩ (a ∧ c) ⊃ (b ∧ c) from (9) by the sufficient condition of (⊃f); and finally (11) ⊩ (a ⊃ b) ⊃ ((a ∧ c) ⊃ (b ∧ c)) by the sufficient condition of (⊃f). That is, axiom 2.12 is semantically valid. The other axioms are validated in a similar fashion. □

As remarked before, each expression of the form Γ ⊩ a is to be read as "problem a reduces to the multiset of problems in Γ". This reading does not mention the word "solution" at all. Hence, the above proof can be read without using the word "solution".

Next, Kolmogorov presents three rules for extending the set of solved problems in the calculus. The equivalence between this calculus and the intuitionistic propositional calculus is explicitly pointed out in the passage:

[Group B] We can now formulate the rules of our calculus of problems.

1. First, we include the problems of group (A) in the list of solved problems.

2. If the list includes ⊢ a ∧ b, then we are allowed to replace it by ⊢ b.

3. If both formulas ⊢ a and ⊢ a ⊃ b are in the list, then we can replace them by ⊢ b.

4. If ⊢ F(a, b, c, . . .) is in the list and p, q, r, . . . are arbitrary problem functions, then we are allowed to replace it by ⊢ F(p, q, r, . . .) in the list.11

Based on the above Postulates, it is easily seen that the formal calculus does in fact guarantee the solution of the corresponding problems.

We are not going to develop this calculus further here, since all formal rules and a priori formulas above coincide with the computational rules and axioms suggested by [Heyting (1930)].12 Hence, we can interpret all formulas of this paper as problems and assume that all problems are solved.

The validity of rules 2, 3 and 4 in Reduction Semantics requires that the following principles be proved:

**Lemma 2.4** *The following principles hold in Reduction Semantics:*

*2.* ⊩ a ∧ b ⇒ ⊩ b; *3.* (⊩ a *and* ⊩ a ⊃ b) ⇒ ⊩ b;

*4.* ⊩ F ⇒ ⊩ F[p, q, r, . . . / a, b, c, . . .]*, where* F[p, q, r, . . . / a, b, c, . . .] *is the result of substituting* a, b, c, . . . *by* p, q, r, . . . *inside* F*, respectively.*

*Proof* Principle 2 is obtained from the necessary condition of clause (∧f), ⊩ a ∧ b ⇒ (⊩ a and ⊩ b), and, *a fortiori*, ⊩ a ∧ b ⇒ ⊩ b. Principle 3 is obtained by using the full (Drop) principle and clause (⊃f). Principle 4 holds in virtue of the clauses being closed under homogeneous substitution of problems. □
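The homogeneous substitution behind Principle 4 can be pictured concretely. The tuple encoding of formulas below is our own illustrative device, not the authors' notation; the point is only that substitution is a structure-preserving map, so any clause instance remains a clause instance.

```python
# Formulas as nested tuples; problem variables are strings.
def subst(formula, assignment):
    """Homogeneous substitution of problems for problem variables (Principle 4)."""
    if isinstance(formula, str):
        return assignment.get(formula, formula)   # variables not mentioned stay put
    op, left, right = formula
    return (op, subst(left, assignment), subst(right, assignment))

# Axiom 2.1, a > (a & a), with the problem p OR q substituted for a:
ax21 = ('imp', 'a', ('and', 'a', 'a'))
inst = subst(ax21, {'a': ('or', 'p', 'q')})
assert inst == ('imp', ('or', 'p', 'q'),
                ('and', ('or', 'p', 'q'), ('or', 'p', 'q')))
```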

**Theorem 2.5** *Kolmogorov's calculus of problems is sound with respect to Reduction Semantics, that is:* ⊢ a ⇒ ⊩ a*.*

*Proof* By Lemmas 2.3 and 2.4. □

There are reasons to believe that the usual set of propositional intuitionistic logical constants is incomplete for dealing with problems. It should probably be extended with other constants, among them the before-after conjunction. If this is correct,

<sup>11</sup> We suspect that the word "replace" employed in 2, 3 and 4 might be a poor translation of the original, since "add" would be a better way to formulate it. A question that might be raised is: does the passage suggest that "problem" and "function" are related concepts? We do not think so. We interpret the passage as just stating that a, b, c, . . . are variables for problems which can be substituted by problems.

<sup>12</sup> The square brackets substitute for the bibliographic note [1] in the original paper.

then completeness is not such a meaningful property. The before-after conjunction will be used and discussed ahead, when problems in geometry are examined. Finally, we notice that Kolmogorov did not give a completeness proof, probably because his semantics was formulated in an intuitive way.

Completeness can be shown if the set of logical constants is restricted to ∧, ∨, ⊃, ⊥. To say that Kolmogorov's calculus of problems is complete with respect to Reduction Semantics means that ⊩ a ⇒ ⊢ a. The recipe for proving it is the following. First, define a sequent calculus with rules for the semantic principles of reduction, namely full (Id), (Load) and full (Drop), and with double inferences, that is, left-to-right and right-to-left inferences corresponding to the clauses for ∨, ∧, ⊃ and ⊥. Second, prove that each of the semantical clauses and principles of Reduction Semantics is a valid metatheoretical property of this calculus; that is, substitute the symbol "⊢" for the semantical symbol "⊩" and prove each Reduction Semantics principle, be it a clause or a structural principle, as a metalanguage property of the calculus just defined.13 This is almost immediate. Third, prove that the calculus just described is theoremhood-equivalent to Kolmogorov's calculus of problems.

The meaning of the problem logical constants was explained in Reduction Semantics through two distinct uses: in the repertoire and in the focus. The fact that the full (Drop) principle, like the full (Identity) principle, holds for the set of logical constants given in the clauses of the semantics above does not mean that it will hold for any extension of that set.

Reduction Semantics is our way to elucidate and clarify the problem interpretation of intuitionistic logic as proposed by Kolmogorov.

#### **2.4 Why problem semantics and intuitionistic logic?**

Problem semantics was the conceptual way in which Kolmogorov interpreted intuitionistic logic. We think that there are two reasons why Kolmogorov adopted such an approach.

The first is the fact that *tertium non datur* (⊢ a ∨ ¬a) is not a valid principle in the calculus of problems or in intuitionistic logic. Kolmogorov (1932, p. 156) interprets the sign "⊢", differently from Heyting, as meaning generality:

For a function F(a, b, c, . . .) of undefined problems a, b, c, . . . we simply write ⊢ F(a, b, c, . . .) instead of (a) (b) (c) . . . F(a, b, c, . . .). Hence, ⊢ F(a, b, c, . . .) denotes the problem "*find a general method for solving the problem* F(a, b, c, . . .) *for each individual choice of the problems* a, b, c, . . .".14

His interpretation is constructive in the measure that it considers the problem of proving logical formulas as requiring a general method of solution. In this sense intuitionistic logic is a part of a theory of problems. In particular, ⊢ a ∨ ¬a would be valid if

<sup>13</sup> Observe that (⊃f) is the semantical version of the deduction theorem.

<sup>14</sup> Our emphasis.

(a) (a ∨ ¬a) were valid, which means that we should have a general method for solving problems of the form a ∨ ¬a for each individual choice of a.15

The observation concerning the requirement of a method of solution brings us to the second reason. It concerns the interpretation of existential propositions when the proof of existence does not exhibit the object. The quotation is curious, since the problem calculus presented considers only propositional constants (Kolmogorov, 1932, p. 157):

Brouwer does not, however, intend to exclude existential propositions from mathematics completely. He only explains that an existential proposition should not be stated without presenting the corresponding construction. At the same time, according to Brouwer, an existential proposition is not a mere indication of the fact that we have already found the desired element of the set. In this case the existential proposition would be false prior to the invention of the construction and true after that. Thus, propositions of a completely new type arise, which, although their content does not change in time, can nevertheless be stated only under certain conditions.

The natural question which can arise is whether this specific type of proposition is a mere fiction. Indeed, the problem "find an element of a set possessing a property" is posed. This problem actually has a certain sense independent of the state of our knowledge. If this problem has been solved, that is, if the corresponding element is found, we obtain the empirical proposition "our problem is now solved". Thus, Brouwer's existential proposition is partitioned into two elements: an objective component (the problem) and a subjective component (its solution).

Intuitionists do not accept the [classical] negation of a universal as a basis for the inference of an existential judgement involving an infinite collection. According to the conceptual background set by Kolmogorov, the meaning of an existential problem cannot depend on the possession of a construction exhibiting the required element, because a previous understanding of the problem is needed in order even to look for the solution. Thus problems have an objectivity that solutions might lack. Problems have to be understood in order to be solved; they must be meaningful. Solutions require ingenuity in order to be obtained, and meaning cannot be made to depend on them.

#### **3 Problems and solutions in geometry**

After offering an analysis of Kolmogorov's problem interpretation in terms of Reduction Semantics, it is time to consider more closely the concepts of problem and solution, taking as a reference for the analysis that piece of mathematical knowledge where problems seem to play a central role: Euclidean Geometry.

<sup>15</sup> Kolmogorov (1932, p. 156): "formula [⊢ a ∨ ¬a] reads as follows: find a general method which for any problem a allows one either to find its solution or to derive a contradiction from the existence of such a solution." Since a formalized language has to be presented before its semantics is defined, all basic problems in a calculus of problems would also be exhaustively enumerated beforehand. The only hope of validating the *tertium non datur* principle occurs when the language is such that we possess a general method for solving all the basic problems enumerated in it. For a non-specific language such a general method is obviously impossible.

#### **3.1 Problems and practical principles**

The first three Postulates of Euclidean Geometry have been regarded historically as practical principles. There is a modern tradition, culminating in Kant, going in this sense (Critique of Pure Reason, A234/B287, our emphasis):16

Now in mathematics *a postulate is the practical proposition* that contains nothing except the synthesis through which we first give ourselves an object and generate its concept, e.g., to describe a circle with a given line from a given point on a plane; and *a proposition of this sort cannot be proved*, since *the procedure that it demands is precisely that through which we first generate the concept of such a figure*.

On Kant's interpretation, the same procedure that is going to generate a circle under Postulate I.3 is also the procedure behind the definition of what a circle is. And it is fair to suppose that he holds a similar opinion about straight-lines. Each of the first three Postulates involves a practical proposition establishing some of the most basic elements of geometry. Although his description of what a postulate is can be understood for the first three, it is not clear that the same holds for the fourth and the fifth Postulates.

Actually, *problemata* were seen by Kant as practical propositions too (*Logik Hechsel*, 1992, p. 88):

[. . .], and problemata, practical propositions which require a solution.

Moreover (*Jäsche Logik*, 1800/1904, § 38, our emphasis):

A postulate is a practical, immediately certain proposition, or a principle that determines a possible action, in the case of which it is presupposed that the way of executing it is immediately certain. Problems (*problemata*) are demonstrable propositions that require a directive, or ones that express an action, the manner of whose execution is not immediately certain. [. . .] Note 2: A problem involves (1.) the *question*, which contains what is to be accomplished, (2.) the *resolution*, which contains the way in which what is to be accomplished can be executed, and (3.) the *demonstration* that when I have proceeded thus, what is required will occur.

Beyond any doubt, the treatment of problems as practical questions had already been considered in the history of philosophy. And the canonical examples of problems Kant is referring to are those in geometry. He underlines three elements involved in problems in the note accompanying the quotation above. This is a good starting point for considering the nature of problems. They involve: (1) the question, or what is asked; (2) the resolution, seen as something that is going to be executed; (3) a proof that the resolution accomplishes what was asked. The first two elements are common to all problems. The third is characteristic of mathematical problems. These three elements are present in many of the geometric *Propositiones* in Euclid's *Elements*, mainly in those asking for a construction, like I.1, for example.

<sup>16</sup> See Lassalle-Casanave (2019).

#### **3.2 What is a problem?**

As remarked before, Kolmogorov did not answer the question of what a problem is. We can try to organize some ideas departing from Kant's considerations.

Many of Euclid's *Propositiones* are construction problems. One example is problem III.1: *to find the center of a given circle*.17 According to Definition I.15, a circle is a plane figure contained by a single line, the circumference, such that all of the straight-lines radiating towards the circumference from one point amongst those lying inside the figure are equal to one another. Thus, historically at least, problem III.1 is not the problem of showing the existence of the center of the circle, since this is already guaranteed to exist according to Euclid's definition of a circle. Also, strictly speaking, there is no construction to be effected, since the circle is already given. The problem for which a solution is then asked is that of finding or determining the center when this is not clear from the given figure, supposing it to be a circle.

**Fig. 1** Prop. III.1

By a series of actions the center of a circle can be found. Why do we say actions? Because the problem is to be solved, in the end, by employing the first three Postulates which, as Kant pointed out, are practical principles. Indeed, the starting step in providing a solution to problem III.1 consists in the production of a chord of the circle, that is, the production of a straight-line according to Postulate I.1. And any one of the endless chords would serve the purpose.

Actually, the construction given as a solution to problem III.1 is only a description of which actions should be taken, and in which order, for determining the center of the circle. It constitutes, then, a recipe of construction, or a resolution in Kantian terms.

What is problem III.1 about? It is about "finding the center", and the verb "to find" is a verb used to describe an action. How can the problem be put to someone, say a student? It can be put by uttering a command or asking for an action: Find the center of the circle! Only actions can be commanded. Clearly, it does not make

<sup>17</sup> All quotations of Euclidean Geometry are taken from Fitzpatrick's (2008) translation of Heiberg (1885).

sense to command a usual proposition or a sentence. We assert sentences. The two speech acts are of a different nature.

The same terminology that Kant used for describing the content of a command or invitation involving an action is going to be employed here: *practical proposition*. A *problem* is stated by asking for or commanding an action whose result and/or execution is unknown or assumed to be unknown. We can separate two aspects of a problem: one corresponding to the speech act involved in uttering the problem as such, asking or commanding; and another concerning the content, or practical proposition, formulating the action being asked. This distinction seems to be suggested in Kant's passages quoted above. Kolmogorov is consistent in using the concept of problem in connection with actions, but he does not make explicit the two aspects just pointed out.

To formulate a problem is the same as to ask for or to command an action: to find something, to draw something and, we have reasons to think, also to prove something. A *solution* to a problem is, in consequence, a certain organized sequence of actions accomplishing the action commanded or asked for. In particular, Postulates can be interpreted as problems of a very specific nature: they are such that the action being asked for is assumed to be immediately feasible, or supposed to be immediately solvable. This explains why the first three Postulates are not proved. They are the simplest problems, to which others are reduced.

But Euclid's *Elements* contains other kinds of *Propositiones* that do not ask for a construction. Most of them are what we could call proving problems. *Propositio* I.47, Pythagoras' theorem, is an example. This is a curious case, since I.47 could also have been stated as a construction problem, asking to obtain a square equal to the sum of two other given squares.

Geometry, in the hands of Hilbert (1899), received a definite assertional turn, in which construction problems like those of Euclid's *Elements* were left out. Nonetheless, the opposite movement of traveling back home might be considered, giving way to a *problemational turn*, that is, a turn in the focus of epistemology making it more sensitive to the original formulation of Euclidean Geometry.

Solutions can be divided into two kinds, which implies that problems are themselves divided into two kinds: token-problems and type-problems. For example, a problem like (1) in Kolmogorov's quotation enumerating problems, the one asking to find four numbers, is a token-problem. It requires a specific action. Its solution, when there is one, involves an act of exhibition of a result or a state of affairs. A problem like (3) is a type-problem, since the solution expected is a recipe/algorithm showing how to fulfil what is being commanded or asked for the given parameters.

Actually, all solutions can be characterized as recipes. Any token-problem can be assimilated, for simplification, to a type-problem requiring a recipe containing a final act of exhibition. In other terms, here the recipe is to be thought of as being recoverable from the succession of actions that produced the result being exhibited.

From a philosophical point of view, the investigation of problems belongs at the same time to theoretical philosophy and to practical philosophy. A solution is a recipe describing actions which, when realized, will solve a problem, while a problem is equivalent to asking for or commanding an action. This seems to be the birthplace of mathematics.

#### **3.3 The nature of solutions and problems in Kolmogorov's perspective**

Reduction Semantics has been proposed above as an elucidation of Kolmogorov's problem interpretation. In Reduction Semantics, problems and solutions are treated in a unified fashion, and this fact is going to be used below for explaining the relative order among Postulates and *Propositiones* in ancient geometry.

Nonetheless, there is a question that requires our attention before we proceed with the analysis of problems. It concerns the two kinds of objects that Kolmogorov's calculus of problems should deal with: propositions and problems.

Problem (5) in the list of examples quoted above is a conditional problem asking to find a way to express the number e as a rational expression. It has, in the condition position, a proposition: "that the number π has a rational expression". Problems are uttered in the imperative mood, while sentences and propositions are uttered in the indicative mood. Example (5) has its condition in the indicative mood and its conditioned part in the imperative mood. No doubt, (5) states a problem, and its main verb is in the imperative mood. How should propositions be dealt with in the context of problems?

As mentioned before, the typed λ-calculus has at its disposal two different structures for representing problems and solutions in one and the same formalism. Formulas are types for λ-terms. Solutions are λ-terms. Problems are regarded as formulas, that is, as types. Propositions or sentences are normally considered a subset of the formulas. Did Kolmogorov intend this classification? We have reasons to doubt it, among other things because different moods are used when stating problems and propositions.
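The "solutions are λ-terms" reading can be made concrete with a small sketch. Python here merely stands in for a typed λ-calculus (our illustrative choice, with the intended types recorded in comments): the λ-terms below are solutions to the problems stated by axioms 2.14 and 2.15 of the calculus.

```python
# Solutions as lambda-terms, in the BHK / formulas-as-types reading.

# Axiom 2.14,  b > (a > b):  given a solution of b, ignore any solution of a.
ax_2_14 = lambda b: (lambda a: b)

# Axiom 2.15,  (a & (a > b)) > b:  a solution of a conjunction is modeled as a
# pair; apply the method solving a > b to the solution of a.
ax_2_15 = lambda pair: pair[1](pair[0])

# Using them on concrete stand-in data:
double = lambda n: 2 * n                  # plays the role of a solution of a > b
assert ax_2_14('solved-b')(None) == 'solved-b'
assert ax_2_15((21, double)) == 42
```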

Propositions or sentences can be dealt with, in the context of the problem interpretation, inside some specific kinds of problems. Actually, Kolmogorov stated, although only in an implicit way, what looks like a unifying perspective. It appears when he discusses the principle of excluded middle. Recall that for him the expression ⊢ a ∨ ¬a should be read as asking to (*ibid.*, p. 156):

[. . .] find a general method which for any problem a allows one either to find its solution or to derive a contradiction from the existence of such a solution! In particular, *if the problem consists of proving a proposition*18, one must have a general method which allows one either to prove each proposition or to reduce it to a contradiction.

Proving propositions then seems to be one important kind of problem, and the answer to our question.

Assuming that proving propositions is a legitimate, distinct kind of problem, the concept of a problem becomes more general than the concept of a theorem. Now, all *Propositiones* in Euclidean Geometry can be regarded as problems: some are mainly construction problems and others are mainly proving problems. In this sense, a calculus of problems would embrace all *Propositiones* without requiring any substantial change in the way they were formulated, instead of transforming construction *Propositiones* into existence statements or something similar.

Problems which consist in proving a proposition are going to be represented by the expression "(prove A)". Next, assuming also that the expression "(deduce

<sup>18</sup> Our emphasis.

from Ω)" represents a basic kind of problem, where is a proposition and Ω a set of hypotheses, the special case where Ω is empty constitutes a way for characterizing what is to prove a proposition , that is:

(†) (prove A) ≡ (deduce A from ∅).

Thus, if it has been shown that (prove A) ⊩ (prove B), then ⊩ (prove A) ⊃ (prove B), by clause (⊃f). This last means ⊩ (deduce A from ∅) ⊃ (deduce B from ∅), according to (†). This corresponds to a transmission of demonstrability, similar to an admissible rule. Observe that it is distinct from ⊩ (prove if A, then B), since this last means ⊩ (deduce if A, then B from ∅) according to (†), and this last means (deduce B from A).19

Verbs commanding an action were used for talking about deductions and proofs, since the expression of problems and solutions requires practical propositions. Deductions and proofs then become just the trace of the actions conveyed in the practical proposition. Two examples in Section 3.6 illustrate the point.

After examining in a few words what proving problems might be, we can deal with problem (5), given as an example by Kolmogorov. Recall that it is formulated as: "assuming that the number π has a rational expression, π = m/n, find a similar expression for the number e". Two distinct interpretations seem to be possible.

First, the problem can be understood as meaning: (prove π = m/n) ⊃ (find . . .). This is a solvable problem, by the mere fact that (prove π = m/n) ⊩ ⊥. Euclidean Geometry contains conditional problems where the condition is a proving problem. *Propositio* I.22 is an example. The construction of the required triangle depends on supposing the provability of a certain relation between the three given straight-lines, A, B and C: any of them must be shorter than the sum of the other two. The mere supposition that three straight-lines are in this relation does not guarantee that the procedure can be carried out adequately.

On a second possible interpretation, (5) could also be understood in a more general sense as a problem to be solved under certain conditions. That π is not a rational number is a fact. Nonetheless, nothing forbids us from supposing, contrary to the fact, that π were equal to m/n for m and n rational numbers. However, to derive a contradiction from this supposition is to misunderstand the point of the counterfactual. The condition in (5) could be interpreted as a counterfactual supposition. And, if this were what Kolmogorov intended, then the meaning of (5) would be different from the meaning considered in the preceding paragraph. It would be equivalent to stating the following problem: find natural numbers m and n such that (e^i)^{m/n} = −1, supposing m and n to be natural numbers and i to be the imaginary number.

<sup>19</sup> The action of deduction can be defined by reflecting inside the object language the content of the focus clauses for Reduction Semantics. They can be characterized relative to the deduction problem as follows:


**Fig. 2** Prop. I.22

We claim that a second kind of conditional problem should be considered in a calculus of problems, one in which the condition is not itself a problem, not even a proving problem. The condition involves just a description of a putative situation. There are related examples in Euclidean Geometry. One is the correctness-proof for the solution to the problem of finding the center of a circle in *Propositio* III.1. It starts with a contrary-to-fact supposition: that the point found is not the center. This supposition is the beginning of a *reductio* proof. It is not a supposition of having proved that the point is not the center, precisely because exactly the opposite had just been shown by the construction. We come back to this issue in Section 4.2 when discussing the relation between hypotheses and problems.

Apparently, Kolmogorov did not consider the second interpretation above with respect to example (5) or, at least, he assumed it to be resolved by the first interpretation. And since our subject here is his calculus of problems, we put the alternative interpretation aside for the time being.

Next we turn to the analysis of solutions in ancient geometry. That is, we proceed to the second task concerning the adequacy question relative to Reduction Semantics.

#### **3.4 Problems in ancient geometry**

Now turning to Book I of Euclid's *Elements*, let's proceed with the epistemological analysis of problems on the basis of Reduction Semantics. What should be expected from it is a certain homogeneity in treating *Propositiones*, Postulates and Common Notions as problems.

Traditionally, logic has been used to offer an analysis, even if partial, of the proof steps and the structure of Euclidean Geometry. Some believe that, at least since Aristotle, proofs have been taken to be discursive sequences of assertions or of declarative sentences. But this perspective does not correspond to what we read in the first three Postulates or in many other *Propositiones*, in particular the first three of Book I. Hence, it is a striking fact that the logical analysis of Euclid's *Elements of Geometry* through declarative sentences starts with an inadequate concept. When Hilbert (1899) came to light things changed, but then all *Propositiones* became declarative statements. Points, lines and planes became three distinct systems of things related by declarative axioms.

Together with other authors, we think that it is possible to provide a logical analysis of ancient geometry maintaining its original formulation with all the richness of its *Propositiones*. From our perspective it is Kolmogorov's merit to have devised an approach that can be fruitfully applied to ancient geometry, although it remained largely unexplored. We mean, of course, his problem interpretation. So, next, the problem interpretation is employed as an epistemological tool for the analysis of the *Elements* via Reduction Semantics.

One attempt in this direction can be found in von Plato and Mäenpää (1990). The authors develop an investigation of Euclidean Geometry based on Martin-Löf's Type Theory, assuming it to be an elucidation of that interpretation. They claim that the first three Postulates can be assimilated to constructive functions. According to them (*ibid.*, p. 281):

The construction postulates lay down the permitted means of producing finite straight-lines, [. . .]. The functionality of postulates suggests a way of rendering them into the general pattern of natural deduction rules used in intuitionistic type theory. Its inferences may be viewed as functions from premises to conclusions. This proof functionality is explicitly recorded in proof objects, that is, in the objects given in the left side of judgements of the form a:A. It is judgements, not propositions, which figure as premises and conclusion in an inference rule.

This interpretation is further discussed and criticized in Naibo (2018). More recently, a deeper exegesis of Euclidean Geometry partially based on von Plato and Mäenpää's approach has been proposed by Sidoli (2018). But Sidoli does not focus on the concept of problems or its elucidation. Also, together with von Plato and Mäenpää, he assumes that points are parameters of constructions. But there are reasons to disagree with them.

Postulate I.1 is interpreted by von Plato and Mäenpää as a constructive function producing a straight-line once two points are given as parameters, similar to a natural deduction rule. But it is doubtful that straight-lines depend on points as parameters in order to be produced. There are at least two reasons for doubting.

First, because the production of a straight-line can start at a place whatever, not determined beforehand, and it can stop at a point not determined in advance. This is the case of the starting act of the resolution of problem III.1, which consists in finding the center of a given circle as in Figure 1. This act is that of producing a chord at random for the given circle. It is a non-deterministic act. If conceived as a constructive function, it is doubtful that the chord's endpoints were previously established when Postulate I.1 was supposed to be applied. And nothing hinders one from having picked any two points whatever on the circumference before drawing the line, similar to what was done in the resolution of *Propositio* I.9: *let the point have been taken at random on . . .*. Most striking yet is that, after the chord is drawn, the perpendicular over the middle point of the chord is raised and, although belonging to the circumference since the beginning, the point where the perpendicular meets the circumference cannot be pinpointed before the perpendicular has been produced in Figure 1.

Second, Postulate I.2 — *. . . to produce a finite straight-line continuously in a straight-line* — does not even mention points, much less a stopping point.

A mental image could be of some help here. Think about the points as being mainly the end or the start of a drawing/production. This is in accordance with Definition I.3. "Ends", as such, have no parts. Also, without an actual drawing there are no starting or stopping points. The first three Postulates strongly suggest such a drawing or action, although in the case of Postulate I.3 the center and the *radius* have to be provided beforehand.

The three beginning *Propositiones* of Book I are naturally read as problems, not as theorems. Their resolutions show how to fulfill what is being asked or commanded: a construction. Actually, the resolution text of the *Propositiones* contains, among others, two principal parts: *kataskeue* and *apodeixis*. The *kataskeue* contains a recipe or solution for the construction problem. It describes certain actions to be effected according to the Postulates. To draw a circle, to draw another circle, to draw a straight line, to draw another straight-line. The *apodeixis* contains a proof that this solution really produces what is demanded in the *Propositio*. Once I.1 — *. . . to construct an equilateral triangle on a given finite straight-line* — has been resolved and the solution has been proven correct, it can be used for obtaining a solution for I.2 — *. . . to place a straight-line equal to a given straight-line at a given point (as an extremity)*. Therefore, the *Propositio* that was formerly interpreted as *a problem* is now used as part of a *solution* for a new problem.

The change in perspective means that the distinction between problems and solutions is superficial and depends on a certain history of accomplishments. First an action is seen as problematic. Then a procedure for effecting the action is offered, and the problematic action now just becomes a piece of knowledge that can be employed in the solution of another problem.

Proving problems were briefly described in the previous section, and they must be considered part of Kolmogorov's calculus of problems. Hence, they are part of the language of problems considered in Reduction Semantics.

*Propositiones* similar to I.47 — i.e., Pythagoras' theorem — are theorems. Their proofs also contain *kataskeue* and *apodeixis*. When proving I.47, similar kinds of problems are to be solved again: first, a construction and, second, a proving problem. In fact both are intertwined and, although proofs in modern reconstructions are composed of assertions, the original geometrical proofs are never exclusively discursive, since they involve practical problems: those involved in the construction.

We claim that Reduction Semantics fits the *Propositiones* of Book I the way they were formulated. Before the analysis goes on, it is necessary to carry out an examination of Postulates and Common Notions within the proposed perspective.

#### **3.5 Common Notions and Postulates**

Common Notions are generic in the sense that they apply to different elements of geometry: lines, angles, triangles, etc. For example, in Common Notion I.2 — *if equal things are added to equal things then the wholes are equal* — the word "things" is a parametric word, which suggests that it can be interpreted as a schematic deductive rule. The same holds for Common Notions I.1 to I.4. Common Notion I.5 is a schematic axiom — *the whole is greater than the part*. Common Notions are either a proving problem assumed to be solved, as in I.5, or basic reduction rules, as in all other cases. For example, I.2 can be rendered for things a, b and c as: ⊩ (prove if (a = b), then (a + c = b + c)). From it follows: (prove a = b) ⊩ (prove a + c = b + c).

Postulates are the fundamental principles governing geometry. The first three Postulates of Book I are of a practical nature, since they are about actions: *. . . to produce a straight-line from any point to any point*; *. . . to produce a finite straight-line continuously in a straight-line*; *. . . to produce a circle with any center and radius*. We claim that they can easily and fairly be understood as problems, even if very simple ones.20 But they are problems of a special kind: they are supposed to be solvable. This supposition does not require supposing, in addition, any description of how to proceed or of which tools should be employed. The alternative of assuming them to be solved seems to be stronger and unreasonable if it involves supposing an infinity of acts to have been effected. The first three Postulates are rendered as:


Postulate I.5 involves the problem of determining when two straight-lines are not parallel. According to Definition I.23, straight-lines are parallel if they do not meet each other when indefinitely extended in any direction. And when another straight-line crossing both forms internal angles less than two right angles, then they are not parallel. That is, the condition of making angles less than two right angles is sufficient for showing that they meet, on the side where the angles are less than that, by indefinitely extending such straight-lines. This Postulate is then a specific principle for straight-lines guaranteeing a solution to the construction problem of finding the meeting point of two straight-lines when a certain condition is fulfilled. This is indeed a construction problem, hence the concept of postulate as presented by Kant fits here perfectly. It becomes, for three straight-lines a, b and c, such that c cuts a in the point A and cuts b in the point B: (prove ∠A + ∠B < 2∠) ⊩ (find the intersection point of a and b

<sup>20</sup> The Greek word *êitêsthô* is the first word occurring in Postulate I.1. It is an imperative verb and it means "ask for", "demand", which is exactly what one says when stating a problem.

, by extending them from A and B).21 This practical principle effects a selection of which surfaces are to be considered in geometry: only flat surfaces.

Finally, Postulate I.4 — *that all right angles are equal to one another* — is a declarative statement. It does not look like a practical principle. It can be seen as an answer to the problem: when are two distinct right angles equal? We assume it as a solved proving problem, differing from Common Notions in that it is specific, not generic. It states that any angles falling under Definition I.1022 are equal angles. Depending on the surface being considered, this is not a trivial matter. It holds only for homogeneous curved surfaces, not for conic surfaces, for example. It is rendered, for any two angles *ABC* and *DEF*: ⊩ (prove if (*ABC* is a ∠ and *DEF* is a ∠), then *ABC* = *DEF*).

The proofs of the *Propositiones* achieve a problem reduction: from a more complex problem to less complex problems down to the bottom — the postulated problems and the Common Notions, in principle. Postulates and Common Notions are supposed to be solvable. And, of course, it does not make sense to ask for a correctness-proof for either. Rephrasing Kolmogorov (1932, p. 151), but avoiding any commitment to the existence of solutions, and keeping in mind that problem reduction involves a specific relation between solutions, as already pointed out before:23

If we can reduce the solution of problem a to the solution of problem b and the solution of problem b to the solution of problem c, then the solution of a can also be reduced to the solution of c.
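Read together with footnote 23 — a reduction of a to b turns any solution of b into a solution of a — the transitivity just quoted amounts to composing such maps. A minimal Python sketch (entirely our illustration; the toy problems and names are hypothetical):

```python
# Illustrative sketch: per footnote 23, a reduction of problem a to problem b
# is modelled as a map taking any solution of b to a solution of a.

def compose_reductions(a_to_b, b_to_c):
    """If a reduces to b and b reduces to c, then a reduces to c:
    a solution of c yields a solution of b, which yields one of a."""
    return lambda solution_of_c: a_to_b(b_to_c(solution_of_c))

# Hypothetical toy problems: a = "find 2x", b = "find x + x", c = "find x".
a_to_b = lambda sol_b: sol_b          # any solution of b already solves a
b_to_c = lambda sol_c: sol_c + sol_c  # from a solution of c, build one of b

a_to_c = compose_reductions(a_to_b, b_to_c)
print(a_to_c(21))  # a solution of c (x = 21) gives a solution of a (2x = 42)
```

The composed map is itself a reduction, which is all the quoted transitivity claims.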

#### **3.6 Towards a general theory of problems, or how to solve it?**

Kolmogorov's point of view plunged intuitionistic logic into a general theory of problems. Effectively, there are reasons to think that the schemes of solutions for geometrical problems offer a model of how to approach logic in terms of problems and their solutions, following the structure of Euclid's *Elements*.

Concerning a general theory of problems, Veloso (1984, p. 29) points out that problem decomposition is one of the main strategies for solving problems:24

A common approach to solving a problem is to partition the problem into smaller parts, find the solutions for the parts, and then combine the solutions for the parts into a solution for the whole.
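The partition–solve–combine pattern Veloso describes can be made concrete with a standard divide-and-conquer algorithm. A sketch using merge sort (our choice of textbook example, not one discussed in the quoted sources' text reproduced here):

```python
def merge(left, right):
    """Combine the solutions for the two parts into one sorted whole."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def merge_sort(xs):
    """Partition the problem, solve the parts, combine the solutions."""
    if len(xs) <= 1:          # a subproblem small enough to be solved directly
        return xs
    mid = len(xs) // 2
    return merge(merge_sort(xs[:mid]), merge_sort(xs[mid:]))

print(merge_sort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

Decomposition stops at directly solvable subproblems; the solution of the whole is then the composition of the solutions of the parts.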

We claim that logical constants give the patterns according to which problems should be decomposed and solved. From our perspective, it was Kolmogorov who

<sup>21</sup> It must be remembered that Kolmogorov's problem interpretation is silent about conditional construction problems with propositional conditions, which would be our preferred solution for this case. See Section 4.2.

<sup>22</sup> Right-angles are defined as the angles formed by the incidence of two straight lines making equal adjacent angles.

<sup>23</sup> That is, when problem a reduces to problem b, any solution of b gives rise to a solution of a.

<sup>24</sup> The original is in Aho, Hopcroft, and Ullman (1975, p. 60).

first devised it in his problem interpretation. Together, reduction and decomposition are the two main general strategies for solving problems, and they form the core of Reduction Semantics. Decomposition and reduction are going to be illustrated next. Let's consider now how the resolution of *Propositio* I.1 takes place.


**Fig. 3** Prop. I.1

The resolution of I.1 contains a construction solution part — to build the equilateral triangle ABC over the straight line AB — which is solved first; and it contains, in sequence, a proving solution part — to prove for the three sides of the triangle that CA = AB = BC.

The construction part is done in two steps. First, the two circles of radius AB and BA are drawn/produced according to Postulate I.3, and they could be done in any order. Second, and only after the first part is done, the straight lines CA and CB are drawn/produced according to Postulate I.1, and they also could be done in any order. Drawing a circle and drawing a straight line are two problems that we assume to be solvable, since they are postulated. Conjunction (∧) is a natural way of composing actions when the order is irrelevant: (draw circle *BCD* of radius AB with center A) ∧ (draw circle *ACE* of radius BA with center B). Conjunction is used again in the composition of the subsequent actions: (draw straight line CA) ∧ (draw straight line CB). But the next step requires the distinction of a before and an after, which we are going to represent by "{", a before-after conjunction:25

(‡) ⊩ {(draw circle *BCD* of radius AB with center A) ∧ (draw circle *ACE* of radius BA with center B)} { {(draw straight line CA) ∧ (draw straight line CB)}.26

At this point the construction problem is solved; a triangle was drawn/produced. What is the solution to the problem of drawing/producing triangle ABC? It is the composite action described in (‡). Since each action in (‡) is a problem considered to be solvable, the whole complex structured action (‡) also shows the problem of drawing/producing an equilateral triangle to be solvable, since Reduction Semantics explains how to understand each logical constant employed. Next, the solution of *Propositio* I.1 requires a verification showing that the triangle is equilateral.

<sup>25</sup> The before-after conjunction is expressed in programming languages by the semicolon ";". But as a logical symbol it might cause some confusion, which is why we do not use it.

<sup>26</sup> It can be read as follows: {(draw circle *BCD* of radius AB with center A) and (draw circle *ACE* of radius BA with center B)} and after {(draw straight line CA) and (draw straight line CB)}.

Proving that the triangle obtained is equilateral is the proving problem part. For proving problems, Postulates I.4 and I.5, the Common Notions, and the Definitions all together contribute to establish the basis of what is considered solvable. Hence the problem (prove CA = AB = BC) is decomposed (and solved) into:

(#) ⊩ ((prove BC = AB) ∧ (prove CA = AB)) { (prove CA = BC)

Definition I.15 about circles establishes ⊩ (prove BC = AB) as positively solvable, since both lines are radii of the same circle *ACE*. A similar reasoning holds for ⊩ (prove CA = AB). In third place, from the two preceding equalities, by using Common Notion I.1, ⊩ (prove CA = BC) is established as positively solvable. That is, (#) makes explicit the decomposition of the problem (prove CA = AB = BC) into subproblems until reaching solvable subproblems. Additionally, the expression gives the trace of the proving procedure.27 The expression (#) is the solution to the problem (prove CA = AB = BC), which is now considered positively solvable.

The whole final expression with the solution of *Propositio* I.1 is:

⊩ {{(draw circle *BCD* of radius AB with center A) ∧ (draw circle *ACE* of radius BA with center B)} { {(draw straight line CA) ∧ (draw straight line CB)}} { {((prove BC = AB) ∧ (prove CA = AB)) { (prove CA = BC)}.

That is, the solution of a problem can be obtained by decomposition of the problem into subproblems until arriving at solvable subproblems, and then by composition of these subproblems already considered solved.28

The before-after conjunction is usually expressed in mathematics by means of function composition. If f(x) and g(x) are two functions, then (g ◦ f)(x) represents the ordered application g(f(x)). Nonetheless, the before-after conjunction can be used when functions cannot. In the case of geometry, it is doubtful that drawing/extending a straight-line should be considered a function depending on previously determined parametric points, since the act of drawing/extending a straight-line might create, so to say, its own starting and stopping points. That is, the actions corresponding to Postulates I.1 and I.2 cannot be identified with a function, much less with a constructive function. As remarked, I.2 does not even mention a stopping point. Although powerful, the functional interpretation does not seem to be the best tool for effecting an epistemological analysis of ancient geometry.
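The contrast can be sketched in code: actions modifying a shared drawing state need not be functions of previously given points, yet they can still be sequenced. A toy Python model (entirely our own illustration; the action names merely echo the resolution of *Propositio* I.1 and are hypothetical):

```python
# Toy model: actions are procedures over a shared drawing state (a list of
# performed acts), so they need not take previously given points as inputs.
# The before-after conjunction is modelled as strict sequencing of actions.

def seq(*actions):
    """Before-after composition: perform the given actions strictly in order."""
    def composed(state):
        for act in actions:
            act(state)
        return state
    return composed

def draw_circle(name):
    return lambda state: state.append(f"draw circle {name}")

def draw_line(name):
    return lambda state: state.append(f"draw line {name}")

# (‡)-style solution: first the two circles, and only after that the two lines.
solution = seq(draw_circle("BCD"), draw_circle("ACE"),
               draw_line("CA"), draw_line("CB"))
trace = solution([])
print(trace)  # the trace records the temporal order of the acts
```

Unlike function composition, nothing here requires each act to be a function of the outputs of the previous ones; the sequencing only fixes the temporal order and leaves a trace.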

After the example of a construction problem, let's examine a theorem *Propositio*, the proving problem I.15: *if two straight-lines cut one another then they make the vertically opposite angles equal to one another*. The proof starts with the following sentence: ($) *Let the straight lines AB and CD cut one another at the point E*. It is

<sup>27</sup> All inferential relations can be read off from this trace. Of course, this is not a natural deduction tree, but such a tree can be built on the basis of the trace given.

<sup>28</sup> One item might seem to be missing in the given solution: the action of determining the intersection point C. Notice that the Euclidean text contains a reference to point C in both circles drawn: circle *BCD* and circle *ACE*, thus apparently bypassing the issue while being nondeterministic about which intersection to consider, remembering that there are two of them. What is seen in the final solution is a reflection of the text in the *Elements*. If the action of determining the intersection point were added to the solution, we would be falsifying the original text.

**Fig. 4** Prop. I.15

required to show that the angle *AEC* is equal to *DEB* (and *CEB* to *AED*). It is a conditional proving problem whose condition is: *if two straight-lines cut one another . . .*. The problem has to be formulated as: (deduce *AEC* = *DEB* from [the supposition that] AB cuts CD). It cannot be formulated as (prove AB cuts CD) ⊩ (prove *AEC* = *DEB*), because the antecedent problem does not seem to match the sentence ($): "let . . ." does not mean "suppose it has been proved that AB cuts CD". We come back to this point in Section 4.2 when discussing the notions of hypothesis and assumption.

Suppose that the straight-line AB cuts CD. Hence, AE stands on the straight-line CD, making the angles CEA and *AED*. The sum of the angles CEA and *AED* is thus equal to two right-angles according to *Propositio* I.13. That is, the problem ⊩ (deduce *CEA* + *AED* = 2∠ from AB cuts CD) is solved. Also DE stands on AB, hence ⊩ (deduce *AED* + *DEB* = 2∠ from AB cuts CD) is solved by I.13 too. Next, by Common Notion I.1, ⊩ (deduce *CEA* + *AED* = *AED* + *DEB* from AB cuts CD) is then solved. And next, subtracting *AED* from both sides, by Common Notion I.3, ⊩ (deduce *CEA* = *DEB* from AB cuts CD) is then solved. That is, ⊩ (prove if AB cuts CD, then *CEA* = *DEB*). End of the proof. In this case, no construction is added to the given figure, so there is no *kataskeue*. The *apodeixis* covers the whole resolution of the proving problem, and here is the final solution:

⊩ {{{(deduce *CEA* + *AED* = 2∠ from AB cuts CD) ∧ (deduce *AED* + *DEB* = 2∠ from AB cuts CD)} { (deduce *CEA* + *AED* = *AED* + *DEB* from AB cuts CD)} { (deduce *CEA* = *DEB* from AB cuts CD)} ∧ {. . . { (deduce *CEB* = *DEA* from AB cuts CD)}

It must be kept in mind that this expression is a rough statement of the solution in terms of actions. For example, the action of subtraction is not made explicit, although its use has been made clear. Also, the deduction relating the conditions of I.13 and I.15 is not represented above. Solutions are communicated, and as such the level of detail is variable and depends on the expertise expected from the audience, provided they do not suffer from syntaxism.

As observed, *Propositio* I.15 starts with: *if two straight-lines cut one another . . .*. This is the statement of a condition that selects the situation to be considered and, of course, not every situation falls under this condition. Parallel lines do not, for example. But other configurations, like that of the lines in Figure 5, do not either, even if they are not parallel.

**Fig. 5** Prop. I.35

Next, the condition is related to another condition, the condition in the formulation of *Propositio* I.13: *if a straight-line stood on a(nother) straight-line, . . .*. In other words, any situation satisfying the identifying condition of *Propositio* I.15 has to already be a situation satisfying the identifying condition of *Propositio* I.13.

Now, some words about the before-after conjunction constant are required, since both solutions in the two examples above employed it.

#### **4 Logic and problems**

#### **4.1 Logical constants and algorithms**

The question of what a logical constant is, is deeply difficult and interesting. Kolmogorov's paper touches the heart of this question, since he states that his calculus of problems should substitute for intuitionistic logic. He also expressed the hope that the schemes of solutions of problems would become an important part of courses in logic.

There are distinct competing theories of what logic is. Many times, a decision about how to interpret logic already involves a decision about what a logical constant is. Logic has been assumed to be the science of logical truths.29 However, it has also been assumed to be a science of formal deductions.30 A third possibility is to conceive it as the central part of a general theory of problems, or of a general theory of problem resolution. We think that this alternative differs from the previous two and, at the same time, extends them, since they are limited to proving problems. The third alternative deserves investigation, since the problem approach allows a homogeneous epistemological analysis of items in the history of mathematics the way they were

<sup>29</sup> See Gómez-Torrente (2019).

<sup>30</sup> See Došen (1989, p. 364).

formulated, without twists. After all, the basic sources for logic theorizing are the argumentative practices historically found mainly in mathematics.

Some might wonder whether the before-after conjunction could not be defined by the other intuitionistic constants. The formula c ∧ (c ⊃ d) seems to be the candidate that comes to mind. That it is not can be realized by asking which of the two problems c or d should be taken as the "first" problem to be solved. This formula does not distinguish a first or a second element. Indeed, a problem like c ∧ (c ⊃ d) can be settled by first solving problem d and next solving problem c, but this is unfaithful to the intuitive meaning of the before-after conjunction.

Examples of the use of before-after conjunction were given above. In the solution of *Propositio* I.1, the production of straight-lines and depends on having beforehand circles *ACE* and *BCD* as well as their intersection points. If the circles were not produced, there would be no intersection point for drawing the two straight-lines. Point is not determined by a constructive function over circles *ACE* and *BCD* as parameters. Only after the circles are produced, there will be two distinct intersection points. Any one can be picked as point for drawing the triangle, it does not matter which. From this perspective, is just a stopping point common to both circumferences produced in accordance with Postulate I.3.

The semantical clauses for the before-after conjunction are as follows:

In the repertoire:

$$(\leadsto^{l})\colon\ \Gamma, c \leadsto d \Vdash e \;\cong\; \text{given any problem } a\ ((\text{first } a \Vdash c \text{, and after } a \Vdash d) \Rightarrow \Gamma, a \Vdash e).$$

In the focus:

$$(\leadsto^{r})\colon\ \Gamma \Vdash c \leadsto d \;\cong\; (\text{first } \Gamma \Vdash c \text{, and after } \Gamma \Vdash d).$$

The before-after conjunction is not among the usual intuitionistic logical constants, nor among the classical ones. Function composition has many times been used for obtaining the before-after effect. We have repeatedly pointed out above why we prefer to consider it a logical constant for problems: some basic actions do not fit the role of functions.

More important, with the before-after conjunction the logical relations in the solution of a proving problem can be recovered, since this kind of conjunction clearly distinguishes the temporal order in which the actions were resolved or established, thus allowing one to obtain the trace of the proof of an assertion. The following elimination rules partially state the inferential behaviour of the before-after conjunction: c { d ⊢ c and c, c { d ⊢ d. Then, clearly, c { d ⊢ c ∧ (c ⊃ d). But the reverse is not correct, as we already argued above.
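Under these elimination rules, the one-way entailment can be displayed as a derivation (our sketch, in sequent notation, using the standard introduction rules for ∧ and ⊃):

```latex
% Sketch: from the elimination rules for the before-after conjunction,
% derive  c { d  entails  c and (c implies d).
\[
\dfrac{c \,\{\, d \vdash c
       \qquad
       \dfrac{c,\; c \,\{\, d \vdash d}
             {c \,\{\, d \vdash c \supset d}\;(\supset\text{-intro})}
      {c \,\{\, d \vdash c \wedge (c \supset d)}\;(\wedge\text{-intro})
\]
```

The converse direction would require extracting the temporal order from c ∧ (c ⊃ d), which, as argued above, that formula does not record.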

The logicality of the before-after conjunction is a matter for debate. However, if Kolmogorov's problem interpretation of intuitionistic logic is regarded as a defensible position, then the absence of the before-after conjunction from the intuitionistic description of the propositional logical constants would mean that this description is incomplete.

The before-after conjunction is at work everywhere in Euclidean Geometry. It corresponds to a sequential decomposition of a problem in view of producing its solution, one of the basic cases covered by the eternal strategist's *dictum*: divide and conquer. And since ancient geometry is a reference source for the concepts of proof and inference, any attempt at logic theorizing should keep a minimal *adequatio* with respect to what one can find in ancient texts.

Geometric proofs are the communication of how to solve certain problems, of which there are two important kinds: construction problems and proving problems. In general, the *kataskeue* contains the solution of a construction problem; the *apodeixis* contains the solution of a proving problem. But they are often mixed. In both cases what is communicated is a recipe, an algorithm. The communication of a geometric proof is then basically the communication of an algorithm. The structure of such communications has to be enough for understanding and undertaking the acts promoting evidence for a *Propositio*. Observe that the communication of an algorithm may sometimes hide evident or repetitive steps in order to make the communication shorter.

Some parts of the *Propositio* prepare or summarize the communication of the algorithm. The non-linguistic part is the accompanying diagram. It is not part of the algorithm. Diagrams exemplify the data that serve as the objects modified, produced or examined in the algorithms being communicated, according to de Campos Sanz (2021). Usually, diagrams represent the final state of the drawing/production and in most cases do not contain the traces of the actions comprised in the recipe, as pointed out by Sidoli (2018).

We next consider one particular item found in these argumentative practices: the use of hypotheses/suppositions in the solution/proof communications occurring in Euclid's *Propositiones*.

#### **4.2 Hypotheses, assumptions and problems in geometric proofs**

In the history of mathematics, problems seem to be by far the main theoretical concern, later displaced by the assertion-theoremhood perspective. Since the XIXth century, logic has also become driven by the assertion-theoremhood perspective. A valid inference is defined as a necessary relation holding among premiss-assertions and a conclusion-assertion. Nonetheless, when suppositions enter the picture things become delicate. Assumptions are sometimes conceived as a special kind of assertion for which we lack a proof and whose provability is then supposed.

In the context of construction problems, there is an exact correlate of assumptions conceived in the way just described. *Propositio* I.22 is a construction problem whose solution depends on supposing that a proving problem has been solved: *to construct a triangle from three straight-lines which are equal to three given. It is necessary for two taken together in any [way] to be greater than the remaining, . . .*. Thus, not just any trio of straight-lines fits the restriction. What is being supposed here is that a proving problem has been solved: the addition of any two of the straight-lines must be greater than the third remaining. Indeed, the solution described in the *kataskeue* of I.22 might not work properly if the straight-lines being used were not in the mentioned relation. Thus, here is an example of an assumption in the mould described, that is, the supposition of having a solution to a proving problem as the starting point for the resolution of I.22.

Nonetheless, there are many other cases in which a certain condition is supposed to hold without supposing that a proving problem has been solved. Take *Propositio* I.15: *if two straight-lines cut one another then they make the vertically opposite angles equal to one another.* The resolution of this proving problem starts with a simpler supposition: that they cut one another. It seems wrong to understand this as an assumption. The condition of cutting one another is the mere description of a possible situation. This situation could be produced in a myriad of different ways; it does not matter which. It might even be the case that we do not know how the straight-lines were produced. What matters is whether we take the drawings as straight-lines and suppose them to fall under the condition described, or not. The diagram in Figure 4 that comes together with the *Propositio* is just an exemplification of such a situation and, of course, the reader should suppose that it falls under the condition stated. In this sense, it also does not matter if the straight-lines drawn are not perfect straight-lines. It is enough to suppose that they are, in order to understand the proof establishing the equality of angles.

The solution for I.15 starts by noticing that the situation of cutting one another falls under the condition of *Propositio* I.13, which also describes a situation: *if a straight-line stood on a(nother) straight-line makes angles, it will certainly either make two right-angles, or (angles whose sum is) equal to two right-angles.* In both cases, no assertion is being made in the condition. The situations are merely considered and one is related to the other. In I.15, under the supposition that the situation is one of cutting one another, it follows that it should also be a situation of standing on another, thus opening the way for using I.13 in the solution for I.15, as already described previously. Observe that there is no Definition, Common Notion, or Postulate guaranteeing that any situation of cutting one another falls under the situation of standing on another. The comprehension of such a link seems to depend on perception and on intellect.

Concerning assumptions and hypotheses, it seems important to adopt a distinction. The concept of *assumption* should be reserved for those cases where either a proving problem or a construction problem is supposed to be solved. Assumptions are then hypotheses of a specific nature: those where we suppose the validity or the possession of a solution for a problem, I.22 being an example.

The concept of *hypothesis* is more general than that of assumption. There are cases where a situation is being supposed, but this cannot be assimilated to the supposition that a proving problem or a construction problem is solvable. We claim that *Propositio* I.15 illustrates this. Hypotheses in the broad sense are the supposition that a given situation falls under certain putative identifying conditions.

There are many examples of hypotheses in Euclid's text. For example, the proof of correctness for the solution of finding the center of the circle in *Propositio* III.1 starts with a counterfactual hypothesis: *. . . I say that (point) F is the center of the [circle] ABC. For (if) not then, if possible, let G (be the center of the circle), . . .*. This hypothesis is the starting point of a *reductio* reasoning. That there should be a center G different from F in the circle just describes a situation that has been deduced from the counterfactual hypothesis and the fact that any circle has a center. The hypothesis does not seem to be an assumption, since this would be tantamount to supposing the possession of a construction showing that F is not the center of the circle or, at least, showing that a different point G is the center of the circle. But what sense would such an assumption have, since the solution to the construction problem has just determined F as the center? The correctness-proof for F being the center seems to be in fact a proof of the statement that no other point inside the circle can be the center, established by a *reductio* proof.

Most of the hypotheses in Euclidean Geometry seem to be of this more general kind, not assumptions. If this is indeed the case, then the proving problem in which they appear involves a conditional proposition. The point certainly deserves a deeper investigation, but we leave it for future work.

#### **5 Conclusion**

Contrary to Kolmogorov's expectation, the calculus of problems has not become a substantial part of contemporary logic courses. Nor has his problem interpretation become widely known in the community. If the above appreciation of both subjects is faithful, then we have reasons to regret that his expectations were disappointed. In any case, his work has been seminal for the development of Intuitionistic Type Theory, as stressed in Coquand (2007).

The above investigation examined the problem interpretation of intuitionistic logic and advanced Reduction Semantics as a way to further elucidate the conceptual structure of this interpretation, with some minor adjustments. Reduction Semantics brings together two main strategies for problem solving, thus giving the basics of general solution schemes. These strategies are reduction among problems and decomposition of problems. The reduction relation has been assumed as the basic semantical relation over which all patterns of problem composition-decomposition are characterized. These patterns happen to be the old, well-known logical constants, as Kolmogorov seems to have anticipated.

Reduction Semantics was also employed in an epistemological analysis of Euclidean Geometry. Besides construction problems, *Propositiones* like I.15 and many others are seen as proving problems. Again, Kolmogorov seems to have fully realized this, thereby opening the door to considering all of Euclid's *Propositiones* as problems, by unifying construction problems and proving problems in the same approach.

Two points deserve to be stressed from the above investigation. First, intuitionistic logic becomes the core of a general theory of problems if we accept Kolmogorov's thesis that this logic is a calculus of problems. Second, this perspective leaves the way open to extending the usual set of intuitionistic logical constants. The before-after conjunction was pointed out as an example of a constant that cannot be defined employing the other constants. In the solution of any proving problem it allows one to keep track of the reasoning.

What has until now been called proofs in Euclid's *Elements* emerge as the communications of a problem resolution. These must be seen as the transmission of an algorithm, since virtually all *Propositiones* may be treated as problems, in view of Section 4.1. Diagrams just accompany the communication of the algorithms. They exemplify the objects being dealt with in the algorithm, similar to the role of syntactical data on the tape of a Turing machine, according to de Campos Sanz (2021).

Finally, the epistemological analysis of Euclidean Geometry has shown that hypotheses are used in distinct roles inside *Propositiones*, the most general being that of identifying criteria for situations. They also appear as the supposition of having a solution. For this last case the word "assumption" has been reserved in order to distinguish it from other uses. These other uses include the case of counterfactual hypotheses, like that in the correctness proof of III.1. The existence of counterfactual hypotheses in Euclid's *Elements* highlights a noticeable fact: historically, the language of mathematics is richer than the current regimented languages with which logicians are accustomed to working.

**Acknowledgements** We thank the anonymous referees and H. Oliveira for their valuable suggestions.

#### **References**


Open Access This chapter is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. If you remix, transform, or build upon this chapter or a part thereof, you must distribute your contributions under the same license as the original.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Disjunctive Syllogism without** *Ex falso*

Luiz Carlos Pereira, Edward Hermann Haeusler and Victor Nascimento

**Abstract** The relation between *ex falso* and *disjunctive syllogism*, or even the justification of *ex falso* based on disjunctive syllogism, is an old topic in the history of logic. This old topic has reappeared in contemporary logic since the introduction of *minimal logic* by Johansson. The disjunctive syllogism seems to be part of our general non-problematic inferential practices, and superficially it does not seem to be related to or to depend on our acceptance of the frequently disputed *ex falso* rule. We know that the acceptance of the *ex falso* is a sufficient condition for the acceptance of the disjunctive syllogism, but the interesting question is: is the *ex falso* a necessary condition for the acceptance of the disjunctive syllogism? The aim of the present paper is to discuss some possible ways to define systems that combine the preservation of the disjunctive syllogism with the rejection of the *ex falso*.

### **1 Introduction**

The relation between *ex falso* and *disjunctive syllogism*, or even the justification of *ex falso* based on disjunctive syllogism, is an old topic in the history of logic. This old topic has reappeared in contemporary logic since the introduction of *minimal logic* by Johansson. The disjunctive syllogism seems to be part of our general non-problematic inferential practices and superficially it does not seem to be related to or to depend on our acceptance of the *ex falso* rule; on the other hand, the general validity of the *ex falso* has been subject to dispute. We know that the acceptance of the *ex falso* is

© The Author(s) 2024 193 T. Piecha and K. F. Wehmeier (eds.), *Peter Schroeder-Heister on Proof-Theoretic Semantics*, Outstanding Contributions to Logic 29, https://doi.org/10.1007/978-3-031-50981-0\_6

Luiz Carlos Pereira

Filosofa, PUC-Rio/UERJ/CNPq, Rio de Janeiro, Brazil, e-mail: luiz@inf.puc-rio.br

Edward Hermann Haeusler Informatica, PUC-Rio, Rio de Janeiro, Brazil, e-mail: hermann@inf.puc-rio.br

Victor Nascimento Filosofa, UERJ, Rio de Janeiro, Brazil, e-mail: victorluisbn@gmail.com

a sufficient condition for the acceptance of the disjunctive syllogism, as the following simple derivation in an intuitionistic natural deduction system shows:

$$
\dfrac{(A \lor B) \qquad [A]^1 \qquad \dfrac{\dfrac{[B]^2 \qquad \neg B}{\bot}\ \neg\mathrm{E}}{A}\ \bot_\mathrm{E}}{A}\ \lor\mathrm{E}\ 1,2
$$

The interesting question is: is the *ex falso* a necessary condition for the acceptance of the disjunctive syllogism?

As was said, the relation between *ex falso* and the disjunctive syllogism has a long history. A form of the disjunctive syllogism appears in Stoic Logic<sup>1</sup> as the fifth type of *undemonstrated* argument, "an argument which, having an exclusive disjunction and the contradictory of one of the disjuncts as premises, infers the other disjunct as its conclusion".<sup>2</sup> Diogenes Laertius, in *Lives of Eminent Philosophers* (VII, 49), gives the following example:

Either it is day or it is night. It is not night. Therefore, it is day.

A medieval argument from the 12th century, attributed to William of Soissons, shows how to derive the *ex falso* from the disjunctive syllogism and other "non-problematic" rules. The argument in natural language is<sup>3</sup>:

I wonder that certain men oppose the thesis that from a per se impossibility anything whatsoever follows . . .. For doesn't it follow that if Socrates is a man and not a man, then Socrates is a man, but if Socrates is a man, then Socrates is a man or a stone. Therefore, if Socrates is a man and not a man, then Socrates is a man or a stone. But if Socrates is a man and Socrates is not a man, then Socrates is not a man. Therefore, if Socrates is a man and Socrates is not a man, then Socrates is a stone.

We can reconstruct this argument axiomatically<sup>4</sup> as:
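Following the quoted text step by step, with A for "Socrates is a man" and B for "Socrates is a stone", one reconstruction (ours) can be sketched as follows:

```latex
\begin{array}{lll}
1. & (A \land \neg A) \to A & \\
2. & A \to (A \lor B) & \\
3. & (A \land \neg A) \to (A \lor B) & \text{(from 1 and 2, by transitivity)} \\
4. & (A \land \neg A) \to \neg A & \\
5. & (A \land \neg A) \to B & \text{(from 3 and 4, by the disjunctive syllogism)}
\end{array}
```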


<sup>1</sup> It is worth noticing that, while stoic disjunction is exclusive, all the new systems examined in this paper use inclusive disjunctions. Even though changes are promoted in elimination rules, we are always allowed to use the standard rules of conjunction elimination and disjunction introduction to show that A ∧ B ⊢ A ∨ B.

<sup>2</sup> Benson Mates (1953), p. 73.

<sup>3</sup> See Martin (1986), p. 571.

<sup>4</sup> We prefer the axiomatic style here as it looks closer to the text.

This argument is considered a precursor to the argument known as *Lewis' argument*<sup>5</sup>:

$$
\begin{array}{lll}
(1) & p \cdot {\sim}p & \text{Assume } p \cdot {\sim}p.\\
(2) & (1)\ .\supset.\ p & \text{If } p \text{ is true and } p \text{ is false, then } p \text{ is true.}\\
(3) & (1)\ .\supset.\ {\sim}p & \text{If } p \text{ is true and } p \text{ is false, then } p \text{ is false.}\\
(4) & (2)\ .\supset.\ p \lor q & \text{If, by (2), } p \text{ is true, then at least one of the two, } p \text{ and } q, \text{ is true.}\\
(5) & (3)\,.\,(4)\ {:}\supset.\ q & \text{If, by (3), } p \text{ is false, and, by (4), at least one of the two, } p \text{ and } q, \text{ is true, then } q \text{ must be true.}
\end{array}
$$

We can also easily show that the disjunctive syllogism axiom implies the *ex falso* theorem:

$$
\dfrac{\dfrac{\dfrac{\dfrac{\dfrac{[B]^1}{(A \lor B)}\ \lor\mathrm{I} \qquad [\neg B]^2}{((A \lor B) \land \neg B)}\ \land\mathrm{I} \qquad (((A \lor B) \land \neg B) \to A)}{A}\ \to\mathrm{E}}{(B \to A)}\ \to\mathrm{I}\ 1}{(\neg B \to (B \to A))}\ \to\mathrm{I}\ 2
$$

And the same result can be obtained if we add the *disjunctive-syllogism rule* (DS)

$$\frac{(A \lor B) \qquad \neg A}{B}\ \mathrm{DS}$$

to minimal logic<sup>6</sup>

$$
\dfrac{\dfrac{\dfrac{\dfrac{[B]^1}{(A \lor B)}\ \lor\mathrm{I} \qquad [\neg B]^2}{A}\ \mathrm{DS}}{(B \to A)}\ \to\mathrm{I}\ 1}{(\neg B \to (B \to A))}\ \to\mathrm{I}\ 2
$$

which allows a simple reconstruction of Soissons' argument in natural deduction:

<sup>6</sup> Rodolfo Ertola-Biraben called our attention to the fact that it is enough to add the following particular case DS¬ of DS

$$\frac{(A \lor \neg A) \qquad \neg\neg A}{A}\ \mathrm{DS}_{\neg}$$

in order to obtain the full power of the *ex falso*.

<sup>5</sup> See Lewis and Langford (1959, p. 250).

$$
\dfrac{\dfrac{\dfrac{(A \land \neg A)}{A}\ \land\mathrm{E}}{(A \lor B)}\ \lor\mathrm{I} \qquad \dfrac{(A \land \neg A)}{\neg A}\ \land\mathrm{E}}{B}\ \mathrm{DS}
$$

It is interesting to observe that all these arguments and proofs are *not normal* in the proof-theoretical sense, as some occurrences of disjunctive formulas are both the conclusion of an introduction rule and the *major premise* of an application of the disjunctive syllogism, which has the shape of an elimination rule.

But are we really committed to the *ex falso* if we accept the disjunctive syllogism? Is it really necessary to resort to the *ex falso* in order to justify the *disjunctive syllogism*? Could we not try some sort of *admissibility argument* to justify the disjunctive syllogism?
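Part of the question can at least be checked semantically. Minimal logic is sound for classical valuations that treat ⊥ as just another propositional atom (it need not come out false), so any formula refuted by such a valuation is unprovable in minimal logic. The following small script is our own illustration, not part of the paper; it refutes both the *ex falso* and the disjunctive syllogism in this sense:

```python
from itertools import product

def implies(x, y):
    """Classical material implication."""
    return (not x) or y

def neg(x, f):
    """Minimal negation: ¬X is defined as X → ⊥, with ⊥ valued by f."""
    return implies(x, f)

def minimally_valid(formula):
    """True iff the formula holds under every assignment to A, B and ⊥.

    Since minimal logic is sound for these valuations, a False result
    shows the formula is not a theorem of minimal logic."""
    return all(formula(a, b, f) for a, b, f in product([False, True], repeat=3))

# Ex falso: ⊥ → A
ex_falso = lambda a, b, f: implies(f, a)

# Disjunctive syllogism as a formula: ((A ∨ B) ∧ ¬B) → A
ds = lambda a, b, f: implies((a or b) and neg(b, f), a)

# A genuine minimal theorem, for comparison: (B ∧ ¬B) → ¬A
minimal_thm = lambda a, b, f: implies(b and neg(b, f), neg(a, f))

print(minimally_valid(ex_falso))     # False: ex falso fails with ⊥ as an atom
print(minimally_valid(ds))           # False: so does the disjunctive syllogism
print(minimally_valid(minimal_thm))  # True
```

So neither principle is a theorem of minimal logic on its own; the open question is only whether accepting one commits us to the other.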

#### **2 An admissibility argument in minimal logic**

An *admissibility* strategy was considered by Tim van der Molen in the paper "The Johansson/Heyting letters and the birth of minimal logic". At the very beginning of the paper we find the following interesting passage:

The provability of Formula 4.41 [((A ∧ ¬A) ∨ B) → B] in minimal logic is a desideratum because it stems from the disjunction property. The disjunction property is a property shared by all the usual intuitionistic formal systems. It states that if we can produce a proof of (A ∨ B), then we can also produce a proof of A or a proof of B. So, if ((A ∧ ¬A) ∨ B) (the antecedent of 4.41) has been proved, then, by the disjunction property, (A ∧ ¬A) is provable or B is. In a consistent system like minimal logic (A ∧ ¬A) is not provable. Therefore, B (the consequent of 4.41) must be provable. This indicates that Formula 4.41 should hold in minimal logic. (van der Molen, 2016, p. 2)

The argument used by van der Molen has the form of an *admissibility argument*: in order to show that the rule

$$\frac{A_1 \qquad \dots \qquad A_n}{B}$$

is admissible, we show that if ⊢ A₁, . . . , ⊢ Aₙ, then ⊢ B. If we try to apply this kind of *admissibility argument*<sup>7</sup> to the disjunctive syllogism we obtain:
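Instantiated to the disjunctive syllogism (our rendering of the schema), the attempted meta-argument reads:

```latex
\frac{\vdash (A \lor B) \qquad \vdash \neg A}{\vdash B}
```

By the disjunction property, ⊢ (A ∨ B) yields ⊢ A or ⊢ B; given ⊢ ¬A, the first alternative would yield ⊢ ⊥, which is impossible by the consistency of minimal logic; hence ⊢ B, the last step being itself a meta-application of the very rule at issue.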

<sup>7</sup> An alternative admissibility argument could be obtained by means of the so-called Dummett's fundamental assumption, according to which every proof (i.e., closed derivation) in intuitionistic logic — and *a fortiori* in minimal logic — can be reduced to a canonical proof (i.e., a closed derivation using an introduction rule in the last step). Both the disjunction property and the consistency of minimal logic are immediate corollaries of applying the fundamental assumption to ⊢ A ∨ B and ⊢ ⊥ (due to the shape of the introduction rule for disjunction and to the absence of introduction rules for ⊥). Then, since we must have either a proof of A or of B and (by assumption) we have a proof of ¬A, the fact that a proof of A would then lead us to a proof of ⊥ and that it is impossible to obtain a proof of ⊥ allows us to use a meta-application of the disjunctive syllogism and conclude ⊢ B. (We would like to thank one of the anonymous reviewers for drawing our attention to this formulation of the admissibility argument.)


Could then an intuitionist accept the disjunctive syllogism without accepting the general validity of *ex falso*? What would be Brouwer's own position concerning the *ex falso* and the disjunctive syllogism? According to van Atten, Brouwer would reject the unrestricted validity of the *ex falso*:

In his dissertation from 1907, Brouwer gave an account of the hypothetical judgement that served him all his life. On that account, hypothetical judgements may in certain cases have false antecedents, but there is no justification of the general principle Ex Falso Sequitur Quodlibet. Neither is the familiar derivation of Ex Falso using the disjunctive syllogism acceptable on Brouwer's view of logic. A systematic conclusion, then, is that Brouwer's logic is a relevance logic. (van Atten, 2009, p. 123)

But again according to van Atten, a kind of "admissibility argument" in defense of the disjunctive syllogism can be attributed to Brouwer:

The application of the disjunctive syllogism is not problematic either. For if A ∨ B is a description that applies to a mathematical construction, this means that we have a mathematical method that, when carried out, will show that the description A applies, or that the description B applies; a proof of ¬A then simply tells us that the outcome of that method will be a proof of B. But then we also know that we would have obtained B as a description of the mathematical construction in question if no independent proof of ¬A had been available to us. The disjunctive syllogism, then, accompanies the mathematical operation of leaving the construction described by B as is. (van Atten, 2009, p. 124)

From the way the argument is formulated, it obviously seems circular: the last step of the argument is an explicit application of the very rule we are trying to justify, to wit, the *disjunctive syllogism*, even if only a *meta-application*! But is it necessary to understand this *meta*-application of the disjunctive syllogism as dependent on a previous acceptance of the *ex falso*? Let us consider the following scenario, where we explore a comparison between our argument paths and the trails we can follow on a promenade.

#### **3 An informal account**

Suppose that John is hiking in a forest and that at some point of the trail he finds a bifurcation point marked A ∨ B. From this point, John could take the path marked by A or the path marked by B. But assume now that there is a sign (an extra piece of information) ¬A indicating that path A will lead to a dead-end (the ⊥). In this case, the only path open to John is the path marked B. This situation can be graphically represented as:

The sign ⊥ here indicates that John *can no longer go* along path A. In a certain sense, this scenario recalls an interesting passage in the third chapter of Brouwer's thesis, where he says:

'But', the logician will retort, 'it might have happened that in the course of these reasonings a contradiction turned up between the newly deduced relations and those that had been kept in store. This contradiction, to be sure, will be observed as a logical figure, and this observation will be based upon the principium contradictionis.' To this I can reply: 'The words of your mathematical demonstration merely accompany a mathematical construction that is effected without words. At the point where you enounce the *contradiction*, I simply perceive that the construction no longer *goes*, that the required structure cannot be imbedded in the given basic structure.' And when I make this observation, I do not think of a principium contradictionis. (Brouwer, 1975, pp. 72–73)

The idea is that the traveller, as Brouwer says, *simply perceives that he can no longer consider path A when he finds the contradiction*. It is true that the *ex falso* may be "secretly" used in the process, but our first impression is that this scenario would be acceptable to a *minimal logician*!

Just one more point: this scenario also recalls Gentzen's interesting remark on the form of disjunction elimination. Gentzen says:

In this example the tree form must appear somewhat artificial since it does not bring out the fact that it is after the enunciation of A ∨ (B ∨ C) that we distinguish the cases A, B and C. (Gentzen's first example (1.1) on page 79 of Gentzen, 1969)

The point of this remark is that the form disjunction elimination has in the usual natural deduction systems is a kind of *artificial* adaptation to the general tree-form of derivations, but that, truly, the assumptions discharged come after the major premiss, as in some multiple-conclusion versions of natural deduction, as the following figure shows:

But if we consider a disjunction as a *branching point*, we need some kind of *synchronization mechanism* to bring paths together again!

The point now is that the case of the disjunctive syllogism seems to require a new kind of *synchronization mechanism*, as the following figure shows:

Maybe this is just an extravagant idea, but now what seemed to be an application of a disputable rule, the *ex falso*, is just a kind of *synchronization mechanism* required by disjunction. This idea applied to the disjunctive syllogism yields:

But as we shall see, this representation is not free of problems. In a certain sense, it suggests that we could go from the *end-point* ⊥ to B, and this path may hide an application of the *ex falso*. Maybe a more faithful representation would be

But this representation would inevitably leave us in the realm of multiple-conclusion systems.

#### **4 The system M∨**

If we do not want to go *multiple-conclusion*, we could try to define a set of new disjunction eliminations (∨⊥-eliminations) as follows<sup>8</sup>:

1. ∨⊥-elimination-1

$$
\begin{array}{ccc}
\Gamma & [A]^m & [B]^n \\
\Pi & \Pi_1 & \Pi_2 \\
A \lor B & C & C \\
\hline
C & & \\
\end{array}\;m,n
$$

2. ∨⊥-elimination-2

$$
\begin{array}{ccc}
\Gamma & [A]^m & [B]^n \\
\Pi & \Pi_1 & \Pi_2 \\
A \lor B & C & \bot \\
\hline
C & & \\
\end{array}\;m,n
$$

3. ∨⊥-elimination-3

$$
\begin{array}{ccc}
\Gamma & [A]^m & [B]^n \\
\Pi & \Pi_1 & \Pi_2 \\
A \lor B & \bot & C \\
\hline
C & & \\
\end{array}\;m,n
$$

Let us consider the natural deduction system M<sup>∨</sup> that is obtained from the propositional part of the Gentzen-Prawitz natural deduction system M for minimal logic through the replacement of the usual disjunction-elimination rule by this new set of disjunction eliminations (∨⊥-elimination-1, ∨⊥-elimination-2 and ∨⊥-elimination-3). It is clear that the system M is a proper subsystem of the new system M<sup>∨</sup> and that M<sup>∨</sup> is a subsystem of the propositional part of the Gentzen-Prawitz natural deduction system I for intuitionistic logic. But is M<sup>∨</sup> a *proper* subsystem of I? Is it possible to prove the full power of the *ex falso*, that is, to prove (A → (¬A → B)), in M<sup>∨</sup>? If it is not possible, then the system M<sup>∨</sup> could be a good candidate for an *intermediate* system, lying between the minimal system M and the intuitionistic system I. But consider now the following simple derivations:

$$
\dfrac{\dfrac{\dfrac{\dfrac{[A]^2}{(A \lor B)}\ \lor\mathrm{I} \qquad \dfrac{[A]^3 \qquad [\neg A]^1}{\bot} \qquad [B]^4}{B}\ \lor_{\perp}\mathrm{E}\ 3,4}{(\neg A \to B)}\ \to\mathrm{I}\ 1}{(A \to (\neg A \to B))}\ \to\mathrm{I}\ 2
$$

<sup>8</sup> This modification of the disjunction-elimination rule was first proposed by Neil Tennant (1979).

$$
\dfrac{\dfrac{\dfrac{\dfrac{\dfrac{[(B \land E)]^4}{B}\ \land\mathrm{E}}{(A \lor B)}\ \lor\mathrm{I} \qquad \dfrac{\dfrac{[(A \lor B)]^3 \qquad [A]^1 \qquad \dfrac{[B]^2 \qquad \neg B}{\bot}}{A}\ \lor_{\perp}\mathrm{E}\ 1,2}{((A \lor B) \to A)}\ \to\mathrm{I}\ 3}{A}\ \to\mathrm{E}}{((B \land E) \to A)}\ \to\mathrm{I}\ 4
$$

The first derivation is a correct non-normal proof of (A → (¬A → B)) in M<sup>∨</sup> using the detour (A ∨ B); the second example is a correct derivation of {¬B} ⊢ ((B ∧ E) → A) using the *detour* ((A ∨ B) → A). How can we avoid these problematic derivations in M<sup>∨</sup>?

The system M<sup>∨</sup> is clearly related to the intuitionistic relevant system IR defined by Tennant (1987). He recognizes that without further restrictions the *ex falso* would be derivable, and he considers the following derivation:

$$
\dfrac{\dfrac{A}{(A \lor B)}\ \lor\mathrm{I} \qquad \dfrac{[A]^1 \qquad \neg A}{\bot}\ \neg\mathrm{E} \qquad [B]^2}{B}\ \lor\mathrm{E}\ 1,2
$$

As we saw, Brouwer would have nothing against the use of the disjunctive syllogism in the derivation above; his qualms would be related to the *composition of derivations*:

The problem is rather with the *composition* of these two inferences. The first inference requires that the mathematical construction being described is one for ⊥; the second that it is one for B. As in general B and ⊥ will not be equivalent descriptions, there is no general guarantee that when ⊥ describes a mathematical construction, B describes it as well. This means that there is no guarantee that the linguistic figures in Lewis' argument accompany a mathematical procedure. (van Atten, 2009, p. 124)

What are "these two inferences" to which van Atten is referring? The recognition that there is a *composition problem* and that some restriction on the composition of derivations is needed is exactly what Tennant does: in any application of an elimination rule, the major premiss of cannot be the conclusion of an introduction rule. The derivations

$$
\dfrac{A}{(A \lor B)}\ \lor\mathrm{I} \qquad\qquad \dfrac{(A \lor B) \qquad \dfrac{[A]^1 \qquad \neg A}{\bot} \qquad [B]^2}{B}\ \lor\mathrm{E}\ 1,2
$$

are correct, but the derivation (the result of the *composition*)

$$
\dfrac{\dfrac{A}{(A \lor B)}\ \lor\mathrm{I} \qquad \dfrac{[A]^1 \qquad \neg A}{\bot} \qquad [B]^2}{B}\ \lor\mathrm{E}\ 1,2
$$

is not.

We can certainly use Tennant's idea (forgetting everything about *relevance*) and impose *normality* by stipulation: only *normal derivations* are accepted as legitimate derivations. Obviously the system M<sup>∨</sup> is Tennant's system IR without the *relevance* restrictions. As in the case of IR, we can say that:


It is true that one could say that imposing *normality* is too high a price to pay in order to preserve the disjunctive syllogism without preserving the *ex falso*. In order to have a better understanding of what is happening with *detours* of the form (A ∨ B) in the derivation above, let us go back to our initial scenario where John is hiking on a trail. Suppose that John is hiking with a map and that he arrives at the same bifurcation point marked (A ∨ B). Let us suppose that path A with some extra information leads to a point C₁, that path B together with extra information leads to a point C₂, and that from the points C₁ and C₂ we can go to point C. We could try to represent this situation with the following figure:

In the case of classical logic, if we are at point C, we can access point C₁, point C₂ and all points in Γ: no *visibility/accessibility restrictions*. But in the case of intuitionistic logic the situation is completely different: after the bifurcation point, *visibility/accessibility* restrictions are required. A more faithful representation of the situation is as follows:

According to this new picture, at point C₁ we have access only to what is inside its box (and the same holds for point C₂ with respect to its box). In the next section we define a new system whose aim is to incorporate these *visibility/accessibility* restrictions.

#### **5 The system M⊥**

Let the natural deduction system M<sup>⊥</sup> be obtained from the propositional part of the Gentzen-Prawitz natural deduction system M for minimal logic by the replacement of the usual disjunction elimination by the following new set of disjunction-elimination rules:

1. ∨⊥-elimination-1

$$
\begin{array}{ccc}
\Gamma & [A]^m & [B]^n \\
\Pi & \Pi_1 & \Pi_2 \\
A \lor B & C & C \\
\hline
C & & \\
\end{array}\;\lor_{\perp}\mathrm{E}\text{-}1\ m,n
$$

2. ∨⊥-elimination-2

$$
\begin{array}{cccc}
\Gamma^{*} & [A]^m & [B]^n & \Gamma_2^{*} \\
\Pi & \Pi_1 & \Pi_2 & \\
A \lor B & C & \bot & \\
\hline
C & & & \\
\end{array}\;\lor_{\perp}\mathrm{E}\text{-}2\ m,n,\Gamma^{*},\Gamma_2^{*}
$$

3. ∨⊥-elimination-3

$$
\begin{array}{cccc}
\Gamma^{*} & [A]^m & \Gamma_1^{*} & [B]^n \\
\Pi & \Pi_1 & & \Pi_2 \\
A \lor B & \bot & & C \\
\hline
C & & & \\
\end{array}\;\lor_{\perp}\mathrm{E}\text{-}3\ m,n,\Gamma^{*},\Gamma_1^{*}
$$

The notation Γ<sup>∗</sup> and Γᵢ<sup>∗</sup> (i = 1, 2) indicates that the assumptions in Γ and in Γᵢ (i = 1, 2) are *frozen*, i.e., that they cannot be discharged below the conclusion of the application of the corresponding ∨⊥-elimination.<sup>9</sup>

The non-normal derivation of (A → (¬A → B)) obtained in M<sup>∨</sup> is clearly not a correct derivation in M<sup>⊥</sup>: the application of disjunction elimination that was used is an application of ∨⊥-elimination-2, and the restriction demanded by the new ∨⊥-elimination-2 rule is not satisfied, since the hypothesis ¬A is discharged below the conclusion of the application. The restrictions on Γ forbid the second problematic example given above. Although these problematic cases are not theorems of M<sup>⊥</sup>, we still have a non-normal derivation of {¬A<sup>∗2</sup>, A<sup>∗1</sup>} ⊢ B!

<sup>9</sup> The intuitionistic multiple succedent sequent calculus FIL defined in de Paiva and Pereira (2005) incorporates this idea by means of devices that control dependency relations between formulas in the antecedent and formulas in the succedent of a sequent.

$$
\dfrac{\dfrac{A^{*_1}}{(A \lor B)}\ \lor\mathrm{I} \qquad \dfrac{[A]^1 \qquad \neg A^{*_2}}{\bot} \qquad [B]^2}{B}\ \lor_{\perp}\mathrm{E}\text{-}3\ 1,2,A^{*_1},\neg A^{*_2}
$$

In order to avoid these problematic derivability relations, let us examine the normalization problem for M⊥.

#### **6 Normalization for M⊥**

Let us now assume that when our hiker arrives at the bifurcation point (A ∨ B), his map indicates that he should take path A. This situation can be pictured as:

After taking the path marked by A, John's *promenade* looks as follows:

If John had found the indication to take path B, we would have the following picture:


And after taking path B, the situation is:

The new ∨<sup>⊥</sup> reductions corresponding to these figures are:

1. A derivation of the form

$$
\begin{array}{ccc}
\Gamma & & \\
\Pi & [A]^m & [B]^n \\
A & \Pi\_1 & \Pi\_2 \\
\hline
A \vee B & C & C\_2 \\
\hline
C & & \\
\Pi\_3 & &
\end{array} \;\vee\_{\perp\mathrm{E}\text{-}1}\; m, n
$$

where C<sub>2</sub> is either C or ⊥, reduces (as usual) to

$$
\begin{array}{c}
\Gamma \\
\Pi \\
{[A]} \\
\Pi\_1 \\
C \\
\Pi\_3
\end{array}
$$

2. A derivation of the form

$$
\begin{array}{ccc}
\Gamma & & \\
\Pi & [A]^m & [B]^n \\
B & \Pi\_1 & \Pi\_2 \\
\hline
A \vee B & C\_1 & C \\
\hline
C & & \\
\Pi\_3 & &
\end{array}
$$

where C<sub>1</sub> is either C or ⊥, reduces (as usual) to

$$
\begin{array}{c}
\Gamma \\
\Pi \\
{[B]} \\
\Pi\_2 \\
C \\
\Pi\_3
\end{array}
$$

3. A derivation of the form

$$
\begin{array}{ccc}
\Gamma^\* & & \\
\Pi & [A]^m & [B]^n \\
B & \Pi\_1 & \Pi\_2 \\
\hline
A \vee B & C & \bot \\
\hline
C & & \\
\Pi\_3 & &
\end{array}
$$

reduces to

$$
\begin{array}{c}
\Gamma^\* \\
\Pi \\
{[B]} \\
\Pi\_2 \\
\bot
\end{array}
$$

4. A derivation of the form

$$
\begin{array}{ccc}
\Gamma^\* & & \\
\Pi & [A]^m & [B]^n \\
A & \Pi\_1 & \Pi\_2 \\
\hline
A \vee B & \bot & C \\
\hline
C & & \\
\Pi\_3 & &
\end{array}
$$

reduces to

$$
\begin{array}{c}
\Gamma^\* \\
\Pi \\
{[A]} \\
\Pi\_1 \\
\bot
\end{array}
$$

We could use here the strategy we used with respect to M∨ and impose *normality* by stipulation: only *normal derivations* are accepted as legitimate derivations.

But we can also use a new strategy, one that was used by Prawitz in *Natural Deduction*. In Appendix A, on set theory, Prawitz introduces the notion of *quasi-deduction*:

F [] is to be the system that is obtained from I [′ ] by the addition of the - and -rule. A *quasi-deduction* in one of these systems is defined in the same way as a deduction was defined for Gentzen-type systems in general (Chapter I, §2). A deduction is then defined as a quasi-deduction that is normal. (Prawitz, 1965, pp. 94–95)

We could use this idea and define:

**Definition 6.1** A *quasi-derivation* in M<sup>⊥</sup> is a derivation as we usually define it. A *derivation* in M<sup>⊥</sup> is a quasi-derivation that is *normal*.

We can now formulate the normalization theorem for M⊥ as follows:

**Theorem 6.2** *Let* Π *be a quasi-derivation of* Γ, Δ<sup>∗</sup> ⊢ A*. Then either* Π *reduces to a derivation* Π′ *of* Γ, Δ<sup>∗</sup> ⊢ A *or* Π *reduces to a derivation* Π′′ *of* Γ, Δ<sup>∗</sup> ⊢ ⊥*.*

The new normalization theorem establishes that a quasi-derivation will always take us either to a normal derivation of the same conclusion or to a normal derivation of ⊥.<sup>10</sup> We can think of quasi-derivations in M<sup>⊥</sup> as *deduction-maps* that may contain *detours*. Once we follow the map eliminating the detours (the *normalization guide*), we will either reach the marked goal or we will reach a *dead-end*.
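This dichotomy can be sketched in code. The following is a minimal, purely illustrative Python model (all names are ours, not from the text): a maximal disjunction introduced from one disjunct is reduced by plugging that disjunct's derivation into the matching branch, so the reduct ends either in the original conclusion or in ⊥.

```python
from dataclasses import dataclass

# Illustrative sketch (names are ours): the reduct of a maximal A∨B
# ends in whatever the branch matching the introduced disjunct ends in.

@dataclass
class OrDetour:
    intro_side: str    # disjunct used by ∨-introduction: 'A' or 'B'
    a_branch_end: str  # end formula of the [A]-branch: 'C' or '⊥'
    b_branch_end: str  # end formula of the [B]-branch: 'C' or '⊥'

def reduce_detour(d: OrDetour) -> str:
    """End formula of the reduct: plug the ∨-introduction's premiss
    into the branch that discharges the corresponding hypothesis."""
    return d.a_branch_end if d.intro_side == 'A' else d.b_branch_end

# Reduction 1: A introduced, [A]-branch ends in C -> reduct proves C
assert reduce_detour(OrDetour('A', 'C', '⊥')) == 'C'
# Reduction 4: A introduced, [A]-branch ends in ⊥ -> reduct is a dead-end
assert reduce_detour(OrDetour('A', '⊥', 'C')) == '⊥'
```

The sketch only tracks end formulas, which is all the dichotomy in Theorem 6.2 is about: following the *normalization guide* either preserves the conclusion or lands on ⊥.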

*Important remark:* This idea of *deduction maps* cannot be applied to the system M∨, since M∨ does not satisfy the new normalization theorem. The following simple derivation is a counter-example. This derivation

$$
\dfrac{\dfrac{\dfrac{[A]^3}{A \vee B} \quad \dfrac{[A]^1 \quad \neg A}{\bot} \quad [B]^2}{B} \;\vee\_{\perp\text{-}3}\; 1, 2}{A \to B} \;\to\mathrm{I}\; 3
$$

reduces to

$$\dfrac{A \quad \neg A}{\bot}$$

We immediately see that the number of open assumptions increases after the reduction (A was not open before the reduction).

<sup>10</sup> We can easily see that in some cases a quasi-derivation can be transformed into different derivations of ⊥.

#### **7 Conclusion**

We have been examining two extensions of the propositional part of the Gentzen-Prawitz natural deduction system M for minimal logic, the systems M∨ and M⊥, that satisfy *in a certain way* the disjunctive syllogism but that do not satisfy the *ex falso*.

The system M<sup>∨</sup> uses the rules ∨-elimination-1, ∨-elimination-2, and ∨-elimination-3, and *normality* is imposed by construction. In the system M∨ we have the following results:

1. By means of the new rules for disjunction, we can easily show that

$$\{(A \lor B), \neg B\} \vdash\_{\mathsf{M}\_{\vee}} A.$$

2. Given that there is no restriction on the deduction theorem in M∨, we also have

$$\vdash\_{\mathsf{M}\_{\vee}} (((A \lor B) \land \neg B) \to A).$$

3. The imposition of *normality by construction* guarantees that

$$\nvdash\_{\mathsf{M}\_{\vee}} (\neg A \to (A \to B)).$$


The system M<sup>⊥</sup> uses the rules ∨⊥-elimination-1, ∨⊥-elimination-2, and ∨⊥-elimination-3, and these rules impose a stricter control on *dependency relations* between assumptions and derived formulas. *Normality* is not imposed by construction, but it is used to define derivations: M<sup>⊥</sup> works with the concept of *quasi-derivations* and defines derivations as quasi-derivations that are normal. The normalization theorem for M<sup>⊥</sup> guarantees that every quasi-derivation Π of Γ, Δ<sup>∗</sup> ⊢ A either reduces to a derivation Π′ of Γ, Δ<sup>∗</sup> ⊢ A or reduces to a derivation Π′′ of Γ, Δ<sup>∗</sup> ⊢ ⊥.

In the system M⊥ we have the following results:


Obviously there is still a lot of work to be done: (1) a more detailed comparison between the systems M<sup>∨</sup> and M⊥; (2) a more detailed comparison with Tennant's systems<sup>11,12</sup>; (3) a deeper exploration of the idea of "freezing" hypotheses; and (4) a more in-depth analysis of the proof theory of M⊥.

**Acknowledgements** The work of Peter has been a great source of inspiration for us, and this is especially true with respect to this paper: its original motivation was Peter's talk on "Paradoxes, local consequence, and the role and future of deductive logic", given at the Swedish Collegium for Advanced Study (SCAS), Uppsala. We would also like to thank Dag Prawitz, Valeria de Paiva, Wagner de Campos Sanz, Evandro Gomes, Abilio Rodrigues, Rodolfo Ertola-Biraben, Bogdan Dicher, Neil Tennant and the Tübingen group for important suggestions and criticisms. And finally, we would like to give a big thanks to the two anonymous referees for their extremely careful, detailed and generous reviews. Unfortunately, some important suggestions could not be incorporated into the revised version of our text, but they will certainly have to be taken into account in our future work. This research was supported by a CNPq project and by the projects CAPES/PRINT and CAPES/COFECUB.

#### **References**


Mates, B. (1953). *Stoic Logic*. Berkeley: University of California Press.

<sup>11</sup> Tennant (1987; 2017).

<sup>12</sup> After the text was ready and submitted, Prof. Bogdan Dicher called our attention to the many similarities between our results and those of yet another paper by Neil Tennant (1994). Both Tennant's extraction theorem and our notion of deduction maps heavily rely on procedures which are applied to derivations in order to produce either derivations with the same end formula or derivations of the absurdity constant. However, there are also many important differences between the two approaches. Tennant imposes normal form by definition on derivations in his relevant intuitionistic logic, whereas our formulation allows derivations to contain maximal formulas which are later expunged by reduction procedures. Moreover, ours is a logic which stands strictly between minimal and intuitionistic logic, whereas the relation between Tennant's system and those logics is considerably more complex; though it allows one to prove all theorems of intuitionistic logic (even ones such as ⊢ ¬A → (A → B)), it invalidates many of its deducibility relations (it can be shown that {¬A, A} ⊬ B, for example). There also seem to be many other interesting differences and similarities between the two approaches but, unfortunately, the late discovery of this relationship effectively prevented us from properly addressing the topic, and so further comparisons will have to be left to future work.


Open Access This chapter is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. If you remix, transform, or build upon this chapter or a part thereof, you must distribute your contributions under the same license as the original.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **The Logicality of Equality**

Andrzej Indrzejczak

**Abstract** The status of the equality predicate as a logical constant is problematic. In this paper we look at the problem from the proof-theoretic standpoint and survey several ways of treating equality in formal systems of different sorts. In particular, we focus on the framework of sequent calculus and examine equality in the light of the criteria of logicality proposed by Hacking and Došen. Both attempts were formulated in terms of sequent calculus rules, although in Došen's case the calculus has a nonstandard character. It will be shown that equality can be characterised in a way which satisfies Došen's criteria of logicality. In the case of Hacking's approach a fully satisfying result can be obtained only for languages with a nonempty, finite set of predicate constants other than equality. Otherwise, the cut elimination theorem fails to hold.

**Key words:** equality, logical constants, sequent calculus

#### **1 Introduction**

It is difficult to find serious applications of logic that do not use equality. Not only is it necessary for the development of mathematical theories, but it also plays an important role in philosophical applications. Yet it is problematic to show that it is a logical constant behaving similarly to indisputable cases like extensional connectives or quantifiers. In this paper we try to look at the problem from the proof-theoretic perspective and ask if it is possible to characterize equality by means of rules satisfying some of the proposed criteria of logicality. For simplicity's sake we restrict our considerations to classical first-order logic (FOL), although the results obtained may easily be transferred to intuitionistic logic (see the concluding remarks). The criteria which will be examined with respect to equality are those proposed by Hacking (1979) and Došen (1989), and the framework for our considerations is provided by Gentzen's

Andrzej Indrzejczak

Department of Logic, University of Łódź, Poland, e-mail: andrzej.indrzejczak@flhist.uni.lodz.pl

<sup>©</sup> The Author(s) 2024 211

T. Piecha and K. F. Wehmeier (eds.), *Peter Schroeder-Heister on Proof-Theoretic Semantics*, Outstanding Contributions to Logic 29,

sequent calculus (SC). This is partly determined by the fact that both approaches to criteria of logicality were proposed in this framework, although in the case of Došen it was not a standard variant of SC. Moreover, the framework of SC seems to be particularly well suited for investigations concerning the problems of criteria for logical constants and in general for investigations in proof-theoretic semantics (see, e.g., Schroeder-Heister 2016).

It will be convenient to start our considerations with some general remarks concerning equality, since the problem of its logical status begins with the proper understanding of what the equality predicate stands for. It is tacitly and commonly assumed that a binary predicate, usually symbolised as =, is introduced into formal languages as a characterization of the identity relation. In fact, the words 'identity' and 'equality' are often treated as synonymous by a majority of mathematicians and logicians (not excluding the author). Usually this does not lead to any problems, but when equality is itself an object of study, we should be more careful. Therefore, we prefer to follow authors like Manzano and Moreno (2017) or Kahle (2016) in keeping a strict distinction between identity and equality. Identity is a relation between objects, equality a relation between terms. The former is a semantic relation that holds trivially only between an object and itself, whereas the latter is a syntactical relation which may hold between any terms of the language. It is natural to postulate that the equality predicate expresses in the language the identity of the objects denoted by its arguments; however, there are serious problems hidden in such an identification. First of all, it can even be doubted whether identity is a genuine relation, and if so, whether it should be represented by some binary predicate. Wittgenstein (1922) presented such a view in his rejection of the very symbol of equality from his language. In fact, Wittgenstein's view can be formally developed in an interesting way, as was shown by Hintikka (1956) and Wehmeier (2014).

Even if we follow the standard practice of treating identity as a relation, one must be aware that the correspondence between the equality predicate and the identity relation is not very strict. In model-theoretic terms identity is just the diagonal relation on the product of the domain of a model. But the equality predicate, as characterised in axiomatic systems of first-order logic (see Section 3), cannot express identity only. Even in the case of a language with a finite number of predicates, one can find nonstandard models in which the axioms of equality do not characterise identity. It seems that second-order logic (SOL) is better in this respect. Well, if we admit that second-order logic is a genuine logic, we can define identity in terms of equivalence and second-order universal quantification, by means of Leibniz' axiom (see Section 3). But second-order logic is expressive enough to capture Peano Arithmetic, so only the logicist position makes this argument unproblematic. Moreover, this holds only in the standard semantics, for which SOL is not complete. If we take Henkin's generalised models to regain completeness, we can again find models where Leibniz' axiom does not determine identity (see, e.g., Manzano, 2005).

Despite the deficiency of equality as a definition of identity (in FOL in particular), it does make sense to check if equality itself may be conceived as a logical constant, and this is our aim. In Section 2 we establish the notation and recall the basic information on sequent calculi and the properties of rules which are important for this task. Several ways of formalising equality in axiomatic and natural deduction systems are surveyed in Section 3. Ways of dealing with equality in sequent calculi are discussed in a separate section. In Section 5 we recall the criteria of logicality formulated by Hacking and check if equality can be formalised in a way conforming to these desiderata. It appears that most of the criteria hold for our proposal, but cut elimination is sensitive to the kind of language under consideration. In Section 6 we describe Došen's approach and a nonstandard structural variant of sequent calculus adapted to its realization. It seems that equality formalised in this kind of system fully satisfies the conditions for a logical constant, but only if equality is not the only predicate of the language. We finish with some remarks concerning further applications and possible generalizations of the presented approach.

#### **2 Preliminaries**

The notation applied in the paper is mostly standard. φ, ψ, χ will represent arbitrary formulae built by means of ¬, ∧, ∨, →, ∀, ∃ from atomic formulae, i.e., predicates followed by a list of terms. Following Gentzen's custom, we distinguish between bound and free occurrences of variables, reserving x, y, z, . . . for representing the former and a, b, c, . . . for the latter, usually called parameters. Nothing essential depends on this distinction, although it simplifies the definition of substitution for terms. Other terms, if any, will be constructed from function symbols of any arity. We will use f, g, h and, on the metalevel, θ for their representation. Arbitrary terms will be represented as τ<sub>1</sub>, τ<sub>2</sub>, . . .. Predicates will also be divided into parameters (schematic symbols) and predicate constants of specific languages determining their signature. Predicates of both categories will be represented either by P, R or, on the metalevel, by π. Incidentally, X will be used as a bound (predicate) variable of the second order. φ(x) denotes a formula having at least one occurrence of x, and φ[x/τ] the result of the correct substitution of τ for all free occurrences of x. Γ, Δ, Σ, . . . represent finite multisets of formulae. Finally, = will be used as the symbol of equality. In general, formulae of the form τ<sub>1</sub> = τ<sub>2</sub> are not counted as atomic, since = is considered as a logical constant.

Following Church's (1956) terminology we distinguish the following language variants of FOLI (FOL with identity):


Thus pure FOLI is just a schematic version of FOL with equality, whereas the several cases of simple applied FOLIs are specific languages characterised by their signatures. For example, the language of the simple applied FOLI of set theory has two binary predicate constants: = and ∈. The cases of applied FOLI are of mixed character

since in addition to constants they admit variables/parameters. This classification is essential for the comparison of different possible characterisations of equality.

As our basic sequent calculus (SC) for classical logic FOL we will use a system which is essentially Gentzen's LK, but with sequents built from multisets to avoid inessential complications. We also prefer to present all two-premiss rules in the multiplicative (i.e., with independent contexts) version, in contrast to Gentzen's original rules for ∧, ∨. Again, nothing essential hinges on this choice, and other variants of SC can also be applied. The calculus consists of the following structural and logical rules:


where a is not in Γ, Δ,

Formulae displayed in the schemata are active, whereas those in the (possibly empty) multisets Γ, Δ are parametric (or form the context). In particular, the unique formula in the antecedent or succedent of the conclusion is the principal formula of the respective rule application, whereas active formulae in the premisses are called side-formulae. The notion of a proof is standard, i.e., a tree labelled with sequents where each leaf is an axiom and edges are regulated by the rules. The height of a proof is the number of nodes in its maximal branches. For stating criteria of logicality it is important to focus on some of the characteristic features of the logical rules. First of all, they are rules of introduction of a constant, either into the antecedent or into the succedent of a sequent. Moreover, using the terminology of Wansing (1999) (see also Poggiolesi, 2011), we can observe that well-behaved rules have the following properties:


Rules satisfying these properties are also called canonical by Avron (2001). In what follows we tacitly assume that candidate rules for characterising a logical constant should have these features or some reasonable generalizations of them. But these are considered only as necessary conditions; for sufficiency we will examine additional requirements formulated by Hacking and Došen.

#### **3 Approaches to equality**

As we remarked, our main tool will be SC, but it is profitable to recall first how equality was (and is) usually dealt with in the framework of other proof systems, in particular in axiomatic or natural deduction (ND) systems. In philosophical considerations we can often find a reference to the traditional characterisation of equality due to Leibniz. This approach may be formally presented as a formula of second-order logic (SOL) which we call LA (Leibniz Axiom):

$$
\tau\_1 = \tau\_2 \leftrightarrow \forall X (X \tau\_1 \leftrightarrow X \tau\_2) .
$$

Commonly, the left-to-right implication is called the principle of indiscernibility of identicals, whereas the converse is called the principle of identity of indiscernibles.<sup>1</sup> The latter principle immediately implies reflexivity of =, whereas full LA is required to prove both symmetry and transitivity of =, as implied by the symmetry and transitivity of ↔. Note also that LA may be weakened in two senses: (a) X may be restricted to atomic predicates; (b) the rightmost equivalence may be replaced with an implication. Restriction (b) is not independent from (a); LA is derivable from the weaker form (b) if we admit that not only predicates but complex formulae (of FOL) may be instantiated for X (see the proofs provided by Read, 2004 or Parlamento and Previale, 2019). This is perhaps not in conflict with the original intuitions of Leibniz, since he seems to consider every context where the respective terms may be exchanged salva veritate. On the other hand, in order to prove that equality is symmetric in the restricted case (b), we must instantiate X with equalities. But this solution shows that LA in the restricted form (b) cannot be treated as a definition of identity, since it is either incomplete or circular. In the former case, to obtain the full characteristics of identity

<sup>1</sup> Although some doubts may be raised against the correctness of the identification of these traditionally considered principles with this formula of SOL, see, e.g., Mates (1986).

we must add explicitly the condition of symmetry. Below we will show that using restricted form (b) of LA leads to other inadequacies as well.
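The interplay between LA and the properties of ↔ can also be checked formally. Here is a small sketch in Lean 4 (our own formalisation, not from the text): Leibniz equality defined via second-order quantification yields reflexivity, symmetry and transitivity from the corresponding properties of ↔, and under the weakened form (b) symmetry is still derivable, but only by instantiating the predicate variable with an equality-like predicate, mirroring the circularity worry just discussed.

```lean
-- Leibniz equality via second-order quantification (full LA, schematic form)
def leib {α : Type} (a b : α) : Prop := ∀ P : α → Prop, P a ↔ P b

theorem leib_refl {α : Type} (a : α) : leib a a := fun _ => Iff.rfl

theorem leib_symm {α : Type} {a b : α} (h : leib a b) : leib b a :=
  fun P => (h P).symm

theorem leib_trans {α : Type} {a b c : α} (h₁ : leib a b) (h₂ : leib b c) :
    leib a c := fun P => (h₁ P).trans (h₂ P)

-- Weakened form (b): only the left-to-right implication.
def leib' {α : Type} (a b : α) : Prop := ∀ P : α → Prop, P a → P b

-- Symmetry survives, but the proof must instantiate P with an
-- equality-like predicate (here `fun x => leib' x a`).
theorem leib'_symm {α : Type} {a b : α} (h : leib' a b) : leib' b a :=
  h (fun x => leib' x a) (fun _ pa => pa)
```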

On the other hand, restriction (a), i.e., the restriction to atomic predicates as admissible instances of X, has some merits in the case of simple applied languages, which was noted by Quine (1970). We can change the 'definiens' into a conjunction of all possible cases. For example, if our language has one unary primitive predicate constant A and one binary R, it takes the form:

$$(\mathbf{L}\mathbf{A}') \quad \tau\_1 = \tau\_2 \leftrightarrow ((A\tau\_1 \leftrightarrow A\tau\_2) \land \forall \mathbf{x} ((R\tau\_1 \mathbf{x} \leftrightarrow R\tau\_2 \mathbf{x}) \land (R\mathbf{x}\tau\_1 \leftrightarrow R\mathbf{x}\tau\_2))).$$

We already mentioned that this is not sufficient to obtain a real definition of identity in FOL, but it can work as a good stipulation of identity in the case of simple applied FOLI, i.e., languages with a finite number of predicate constants.

If we restrict our considerations to FOL, a characterisation of equality in terms of LA is of no use in the case of the pure or applied versions of the language, but it still may have some heuristic value. In particular, on the ground of Hilbert systems one may distinguish two approaches, which we call algebraic and Leibnizian. In the former, equality is characterised simply as a congruence on terms, so we need to state first that it is an equivalence relation, expressed by:


This is enough for simple FOLI; in the case of simple applied versions of FOLI we must add two principles of congruence for every primitive atomic predicate and term:

1. Congruence of Predicates CP:

$$\forall \mathbf{x}\_1, \dots, \mathbf{x}\_n, \mathbf{y}\_1, \dots, \mathbf{y}\_n\, (\mathbf{x}\_1 = \mathbf{y}\_1 \land \dots \land \mathbf{x}\_n = \mathbf{y}\_n \to (\pi^n(\mathbf{x}\_1, \dots, \mathbf{x}\_n) \to \pi^n(\mathbf{y}\_1, \dots, \mathbf{y}\_n))),$$

where π<sup>n</sup> is an n-argument predicate symbol.

2. Congruence of Terms CT:

$$\forall \mathbf{x}\_1, \dots, \mathbf{x}\_n, \mathbf{y}\_1, \dots, \mathbf{y}\_n\, (\mathbf{x}\_1 = \mathbf{y}\_1 \land \dots \land \mathbf{x}\_n = \mathbf{y}\_n \to \theta^n(\mathbf{x}\_1, \dots, \mathbf{x}\_n) = \theta^n(\mathbf{y}\_1, \dots, \mathbf{y}\_n)),$$

where θ<sup>n</sup> is an n-argument function symbol.

This way of characterising equality is particularly elegant if we deal with simple applied first-order languages having only a small number of primitive predicate or function constants. It works better than the characterization via LA′, since we obtain one axiom for every predicate instead of equivalences for every n-ary predicate. Moreover, if we treat equality as a primitive atomic predicate, it is not necessary to add symmetry and transitivity of = explicitly, since they are provable by means of CP.
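The algebraic reading of the equivalence axioms together with CP and CT is essentially what congruence-closure procedures implement. The following minimal Python sketch is our own illustration (the text does not discuss algorithms): a union-find structure supplies REF, SYM and TR, and an extra saturation loop enforces CT-style congruence for a unary function symbol f.

```python
class UnionFind:
    """REF, SYM and TR for free: terms in one class are provably equal."""
    def __init__(self):
        self.parent = {}
    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

def congruence_closure(equations, terms):
    """Close `equations` under CT for terms written as 'f(.)'."""
    uf = UnionFind()
    for a, b in equations:
        uf.union(a, b)
    changed = True
    while changed:
        changed = False
        for s in terms:
            for t in terms:
                # CT: from x = y infer f(x) = f(y)
                if (s.startswith('f(') and t.startswith('f(')
                        and uf.find(s[2:-1]) == uf.find(t[2:-1])
                        and uf.find(s) != uf.find(t)):
                    uf.union(s, t)
                    changed = True
    return uf

uf = congruence_closure([('a', 'b')], ['a', 'b', 'f(a)', 'f(b)', 'c'])
assert uf.find('f(a)') == uf.find('f(b)')  # CT applied to a = b
assert uf.find('c') != uf.find('a')        # c remains separate
```

The point of the sketch is only that the algebraic axioms are decidable to saturate over a fixed finite signature, which is exactly why this characterisation suits simple applied languages.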

Authors dealing with pure or applied FOL, i.e., with predicate and function parameters, usually prefer the latter approach, which we called Leibnizian. It also requires reflexivity, usually stated schematically as τ = τ for every term τ, and the extensionality principle:

$$\text{(EP)}\qquad\qquad\forall \text{xy} (\text{x} = \text{y} \land \varphi \{\text{z}/\text{x}\} \to \varphi \{\text{z}/\text{y}\}),$$

where φ is arbitrary or atomic. The latter form is simpler to formulate, since there is no problem with bound variables; moreover, the general form is provable in extensional FOLI. In what follows we will keep the name EP for the version with arbitrary φ and call the version restricted to atoms the Leibniz principle LP. One should note that it encodes one direction of LA, namely indiscernibility of identicals in the weaker form (b), i.e., with ↔ replaced by →. This explains why we call this approach Leibnizian. One should also note that, contrary to ordinary custom, we defined EP (LP) by means of the (correct) substitution of x and y for some free variable z. It is much more popular to characterise it in terms of replacement:

$$\text{(EP')}\qquad\forall \text{xy} (\text{x} = \text{y} \land \varphi \to \varphi[\text{x}//\text{y}]),$$

where φ[x//y] denotes the replacement of some (not necessarily all) occurrences of x by y. It is perhaps intuitively more accessible, but it has a formal disadvantage, since replacement is not an operation. To avoid this problem some authors define EP (LP) by means of a unique replacement:

$$\text{(EP'')}\qquad\forall \text{xy} (\text{x} = \text{y} \land \varphi(\dots \text{x} \dots) \to \varphi(\dots \text{y} \dots)),$$

where only one displayed occurrence of a variable (term) is taken into account. The last formulation is also the simplest for arithmetization, hence it is applied in works dealing with Gödel's theorems. But it should be stressed that all these forms of characterization are equivalent. In particular, any possible application of EP′ is just a series of applications of EP′′. Also, SYM and TR are easily provable by any of these principles, so it is not necessary to introduce them as primitive axioms.
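The remark that any EP′ application decomposes into EP′′ steps can be made concrete. A toy Python sketch, entirely our own illustration, with formulas modelled as strings: replacing several occurrences of a term is just an iteration of single-occurrence replacements.

```python
def ep2(phi: str, i: int, x: str, y: str) -> str:
    """One EP'' step: replace only the i-th occurrence of x in phi by y."""
    parts = phi.split(x)
    return x.join(parts[:i]) + y + x.join(parts[i:])

# An EP' application replacing both occurrences of 'a' in R(a,a)
# is carried out as two EP'' steps:
step1 = ep2('R(a,a)', 1, 'a', 'b')  # replace the first occurrence
step2 = ep2(step1, 1, 'a', 'b')     # the remaining occurrence is now first
assert step1 == 'R(b,a)'
assert step2 == 'R(b,b)'
```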

Instead of REF we can find (for example in Tarski, 1941):

$$(\exists =) \qquad \exists \text{x} (\text{x} = \tau), \text{ where } \text{x} \text{ is not in } \tau.$$

This formula implies reflexivity by EP: x = τ ∧ x = τ → τ = τ. In Mates (1965) the same characterization is applied, but with universal closures of EP′ and (∃=).

There are also approaches which dispense with reflexivity, instead using just one formula which is equivalent to REF and EP together. One such approach is due to Wang (see Quine, 1966), and a dual axiom is due to Kalish and Montague (1957). On the ground of a Hilbert system each one may be expressed by a single axiom:

$$\exists \mathbf{x} (\mathbf{x} = \boldsymbol{\tau} \land \boldsymbol{\varphi}) \leftrightarrow \boldsymbol{\varphi} [\mathbf{x}/\boldsymbol{\tau}], \text{ where } \mathbf{x} \text{ is not free in } \boldsymbol{\tau}.$$

or

$$\forall \mathbf{x} (\mathbf{x} = \tau \to \varphi) \leftrightarrow \varphi \{ \mathbf{x} / \tau \}, \text{ where } \mathbf{x} \text{ is not free in } \tau.$$

In the framework of natural deduction (ND) the Leibnizian approach is prevalent although EP (LP) is usually presented as an inference rule of identity elimination:


$$\begin{array}{cc} \text{(IE)} & \tau\_1 = \tau\_2, \varphi\{\mathbf{x}/\tau\_1\} \vdash \varphi\{\mathbf{x}/\tau\_2\}, \end{array}$$

whereas reflexivity is treated as a zero-premiss rule of identity introduction:<sup>2</sup>

$$
(\text{II}) \qquad \vdash \tau = \tau
$$

Martin-Löf (1971) has shown that such a pair of rules is naturally induced for the equality predicate in the general ND framework which satisfies normalization.

Kalish and Montague (1964) provide an ND system where the pair of rules is of the form:

$$(\text{IE}') \qquad \forall \text{x}(\text{x} = \tau \to \varphi) \vdash \varphi\{\text{x}/\tau\}, \text{ where } \text{x} \text{ is not free in } \tau.$$

$$(\text{II}') \qquad \varphi\{\text{x}/\tau\} \vdash \forall \text{x} (\text{x} = \tau \to \varphi), \text{ where } \text{x} \text{ is not free in } \tau.$$

Both are clearly derived from the second axiom stated above.

There are also ND systems which operate not on formulae but on sequents of the form Γ ⊢ φ, where Γ is a set (multiset, sequence) of active assumptions for φ.<sup>3</sup> In this setting the rules for equality are formulated as follows:

$$(\Pi'') \qquad \qquad \Gamma \vdash \tau = \tau,$$

$$\text{(I\!E\prime\prime)}\qquad\text{If }\Gamma\vdash\tau\_1=\tau\_2\text{ and }\Delta\vdash\varphi\{\mathbf{x}/\tau\_1\}, \text{ then }\Gamma,\Delta\vdash\varphi\{\mathbf{x}/\tau\_2\}.$$

Finally let us point out the solution which is particularly important for our purposes. Read (2004) provides a rule of equality introduction as a proof construction rule of the form:

$$\begin{aligned} \text{(RII)} \qquad & \qquad \text{If } \Gamma, \varphi[\mathbf{x}/\tau\_1] \vdash \varphi[\mathbf{x}/\tau\_2] \text{, then } \Gamma \vdash \tau\_1 = \tau\_2, \\ \text{where } \varphi \text{ is atomic and does not occur in } \Gamma. \end{aligned}$$

This rule is not sound in standard models for FOLI; however, it is sound in so-called Leibnizian models (see Read, 2004), and it may be shown that this class of models can equivalently characterise FOLI. Note that in the context of simple applied languages with a finite number of primitive predicates the corresponding result cannot be stated by means of such a rule; instead it may be stated by means of a finite number of subproofs, one for each predicate constant. Something similar was proposed by Więckowski (2011), who provided a rule of the form:

$$(\text{WRI}) \qquad \varphi\_1[\text{x}/\tau\_1] \leftrightarrow \varphi\_1[\text{x}/\tau\_2], \ \dots, \ \varphi\_n[\text{x}/\tau\_1] \leftrightarrow \varphi\_n[\text{x}/\tau\_2] \vdash \tau\_1 = \tau\_2.$$

Note that here we have an inference rule, since instead of subproofs we have a finite number of premisses. The drawback of this rule is that another constant, namely the

<sup>2</sup> Some authors add also inference rules corresponding to SYM and TR but of course they are derivable.

<sup>3</sup> The fact that we deal with sequents not with formulae is sometimes hidden since Γ consists not of formulae but of numbers referring to lines where assumptions were stated — see, e.g., Suppes (1957), Lemmon (1965).

equivalence connective, is present, so the rule is not separate (see the previous section). It could be changed into a separate rule (i.e., with displayed equality only) by replacing each premiss φ<sub>i</sub>[x/τ<sub>1</sub>] ↔ φ<sub>i</sub>[x/τ<sub>2</sub>] with a pair of subproofs φ<sub>i</sub>[x/τ<sub>1</sub>] ⊢ φ<sub>i</sub>[x/τ<sub>2</sub>] and φ<sub>i</sub>[x/τ<sub>2</sub>] ⊢ φ<sub>i</sub>[x/τ<sub>1</sub>]. Note that it does not directly correspond to Read's rule: the latter would then have an additional subproof leading from φ[x/τ<sub>2</sub>] (as an assumption) to φ[x/τ<sub>1</sub>]. The reason why Read dispenses with the second subproof was explained above; WRI is based on LA (or rather LA′), whereas RII is based on LA with unrestricted instantiation for X, which enables the replacement of equivalences by implications. Read's solution may be criticised not only from a technical but also from a philosophical point of view (see, e.g., Griffiths 2014, or Klev 2019), but it deserves careful examination. In what follows we will check how it works in the setting of SC.

#### **4 Equality in sequent calculi**

SC provides a framework which not only easily accommodates all the approaches described so far but also allows for several other solutions. An interesting feature is that in SC we can characterise equality not only by local rules but also globally, in the following way:

$$(\text{SUB})\ \frac{\tau\_1 = \tau\_2, \Gamma[\mathbf{x}/\tau\_1] \Rightarrow \Delta[\mathbf{x}/\tau\_1]}{\tau\_1 = \tau\_2, \Gamma[\mathbf{x}/\tau\_2] \Rightarrow \Delta[\mathbf{x}/\tau\_2]} \qquad\qquad (\text{REF})\ \Rightarrow \tau = \tau$$

where Γ[x/τ] denotes the uniform substitution of τ for x in all elements of Γ. Such a solution was first introduced by Kanger (1957) but also proposed by Wang (1960), in a version where the substitution is made only in Δ; this apparently weaker version is in fact sufficient. Essentially the same solution was used, among others, by Mints (1968), Došen (1989), and Seligman (2001), in several variants (for example with τ₁ = τ₂ only in the conclusion). Usually (SUB) is introduced in two variants, where in the second we have τ₂ = τ₁, but this is redundant. A similar approach was also applied by Schroeder-Heister (1994) in the formalization of the free equality investigated in the setting of logic programming.
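For illustration, the symmetry of equality is immediately derivable from (REF) by one application of (SUB): take Γ to be empty and Δ(x) to be x = τ₁, so that Δ[x/τ₁] is τ₁ = τ₁ and Δ[x/τ₂] is τ₂ = τ₁:

$$\frac{\dfrac{\Rightarrow \tau\_1 = \tau\_1}{\tau\_1 = \tau\_2 \Rightarrow \tau\_1 = \tau\_1}\ (W{\Rightarrow})}{\tau\_1 = \tau\_2 \Rightarrow \tau\_2 = \tau\_1}\ (\text{SUB})$$

Transitivity is obtained analogously, applying (SUB) to τ₂ = τ₃ with Δ(x) taken as τ₁ = x, starting from the axiom τ₁ = τ₂ ⇒ τ₁ = τ₂.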

This form of introduction of equality is global, since we can treat a sequent as expressing a whole proof at some stage. Hence, to imitate the application of this rule in ND we would have to rewrite the whole derivation; so it is in fact global in comparison with the solutions proposed in the ND setting. In SC this is reduced to an operation performed on a sequent, not on the whole derivation. Such an approach has obvious virtues. One may easily prove everything which is needed to show that it is adequate; we obtain a proof of LP immediately. Moreover, cut elimination holds for it (see, e.g., Seligman, 2001 for a constructive proof; the version of Kanger is just a cut-free variant of G3). It is also worth noting that it was the most influential approach in automated theorem proving based on SC.<sup>4</sup> However, it seems that this approach is not fully convincing as a way of justifying equality as a logical constant. Even in the form proposed by Došen (see Section 6) equality is presented as something which

<sup>4</sup> Degtyarev and Voronkov (2001) present it as the only SC-based approach to equality formalization in automated deduction.

affects the whole sequent, and thus looks like something of a slightly different character from the other logical constants, which are characterised by local rules.

Other approaches are of a local character and can be divided according to the possible ways of formalizing theories in the framework of SC. Negri and von Plato (2001) described four approaches to this question:


In every class we can find SC formalizations of equality. The first approach may be called "naive", since it treats SC in the same way as a Hilbert system and does not refer to its specific features. Not surprisingly, such a solution obstructs the application of the specific virtues of SC; in particular, cut elimination cannot be proved.

The second approach may be seen as a refinement of the first and was already applied by Gentzen in his formalization of Peano arithmetic. Restricted to equality, it leads to the addition of two axiomatic (atomic) sequents of the form:

$$\Rightarrow \tau = \tau$$

$$\tau\_1 = \tau\_2, \varphi[\mathbf{x}/\tau\_1] \Rightarrow \varphi[\mathbf{x}/\tau\_2] \text{ with } \varphi \text{ atomic}$$

In Takeuti (1987) one may find variants of this approach and proofs of some of its features. It is interesting that although cut elimination cannot be proved in general for such a system, it may be proved in a restricted form. Let us call a cut inessential if the cut formula is an equality; otherwise the cut is essential. For Takeuti's system it holds that all essential cuts are eliminable. Recently Parlamento and Previale (2019) proved an even stronger result, showing that after an additional series of transformations cut can be eliminated from all proofs.
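For instance, a derivation of symmetry in this setting ends with an inessential cut: instantiating the second axiomatic sequent with φ(x) := x = τ₁ and cutting with an instance of the first one yields:

$$\frac{\Rightarrow \tau\_1 = \tau\_1 \qquad \tau\_1 = \tau\_2, \tau\_1 = \tau\_1 \Rightarrow \tau\_2 = \tau\_1}{\tau\_1 = \tau\_2 \Rightarrow \tau\_2 = \tau\_1}\ (\text{Cut})$$

Here the cut formula τ₁ = τ₁ is an equality, so this cut is inessential in the sense just defined.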

The third approach was also considered by Gentzen. Interestingly enough, one can prove cut elimination for it, but for a theorem φ of FOLI we do not obtain a proof of ⇒ φ but of Γ ⇒ φ, where Γ is a collection of instances of REF and LP. Accordingly, this approach does not provide an interesting tool for the analysis of proofs.

There is a variant of this approach which can in fact also be treated as belonging to the last group. For each axiom φ an SC rule is postulated for the elimination of this axiom, which is of the form:

$$(\text{AE})\ \frac{\varphi, \Gamma \Rightarrow \Delta}{\Gamma \Rightarrow \Delta}$$

Such a formalization of cut-free SC for FOLI is considered in Gallier (1986), where φ can be an instance of REF, CP or CT. Bell and Machover (1977) also apply this approach in the tableau framework, where it is represented just as a rule introducing suitable instances of REF, CT or CP on the branch. Note that in the context of SC this solution, although applied in the cut-free version, is in fact equivalent to the addition of a special form of cut. This follows from the result that the cut elimination theorem

is equivalent to the eliminability of (AE) where φ is an arbitrary thesis (see Indrzejczak, 2017).

It seems that the most interesting approach, in particular for our purposes, is the last one. Gallier's solution is, literally speaking, of this sort, but it is not very satisfactory, since (AE) is a rather trivial kind of rule mechanically applied to any formula. The subformula property does not hold, and cut-freeness is only apparent, as we noted above. Generally speaking, it is not in any sense better than the first approach. What we need is the generation of genuine rules with active formulae in premisses and conclusions, satisfying some welcome proof-theoretic properties like cut elimination or a reasonable form of the subformula property. Such a solution, which, in some specific form, was advocated by Negri and von Plato (2001) and applied to formalisations of several theories on the basis of SC, found many adherents. In particular, Troelstra and Schwichtenberg (1996) in the second edition of their well-known textbook introduced this characterization of equality instead of the second one, which was present in the first edition. Equality is characterised by means of two rules:

$$(\text{RE})\ \frac{\tau\_1 = \tau\_2, \varphi[\mathbf{x}/\tau\_1], \varphi[\mathbf{x}/\tau\_2], \Gamma \Rightarrow \Delta}{\tau\_1 = \tau\_2, \varphi[\mathbf{x}/\tau\_1], \Gamma \Rightarrow \Delta} \qquad\qquad\qquad (\text{REP})\ \frac{\tau = \tau, \Gamma \Rightarrow \Delta}{\Gamma \Rightarrow \Delta}.$$

These rules are added to the purely logical variant of G3. In general, the specific features of Negri and von Plato's approach are connected with the fact that active formulae are atomic and occur only on one side of sequents. We will call this variant the one-sided approach; the systems in Negri and von Plato (2001) are in fact left- (or antecedent-)sided, but in Negri and von Plato (2011) right- (or succedent-)sided systems are also considered. Rules of this kind can safely be added without destroying the results concerning admissibility of structural rules, including cut. However, the rule-based approach to the characterisation of theories may be realised in many different ways, not necessarily as one-sided, despite its obvious virtues.
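For illustration, in this system symmetry is derived by one application of each rule, taking φ(x) in (RE) to be x = τ₁:

$$\frac{\dfrac{\tau\_1 = \tau\_2, \tau\_1 = \tau\_1, \tau\_2 = \tau\_1 \Rightarrow \tau\_2 = \tau\_1}{\tau\_1 = \tau\_2, \tau\_1 = \tau\_1 \Rightarrow \tau\_2 = \tau\_1}\ (\text{RE})}{\tau\_1 = \tau\_2 \Rightarrow \tau\_2 = \tau\_1}\ (\text{REP})$$

The top sequent is an axiom, and (REP) discharges the remaining instance of reflexivity.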

In order to put things in a systematic way we apply the following theorem (Indrzejczak, 2018b):

**Theorem 4.1 (Rule-generation)** *For any sequent* Γ ⇒ Δ *with* Γ = {φ₁, . . . , φₙ} *and* Δ = {ψ₁, . . . , ψₘ}, n ≥ 0, m ≥ 0, n + m ≥ 1, *there are* 2<sup>n+m</sup> − 1 *equivalent rules captured by the general schema:*

$$\frac{\Pi\_1 \Rightarrow \Sigma\_1, \varphi\_1 \quad \dots \quad \Pi\_i \Rightarrow \Sigma\_i, \varphi\_i \qquad \psi\_1, \Pi\_{i+1} \Rightarrow \Sigma\_{i+1} \quad \dots \quad \psi\_j, \Pi\_{i+j} \Rightarrow \Sigma\_{i+j}}{\Gamma^{-i}, \Pi\_1, \dots, \Pi\_{i+j} \Rightarrow \Sigma\_1, \dots, \Sigma\_{i+j}, \Delta^{-j}}$$

*where* Γ<sup>−i</sup> = Γ − {φ₁, . . . , φᵢ} *and* Δ<sup>−j</sup> = Δ − {ψ₁, . . . , ψⱼ} *for* 0 ≤ i ≤ n, 0 ≤ j ≤ m*.*

It should be stressed that the proof of this theorem requires only applications of axioms and cut (see Indrzejczak, 2018b). Informally, it shows that for any sequent we can provide different rules which are interderivable with it. Premisses of these rules are obtained either by deleting some formula from the antecedent of the respective sequent and putting it into the succedent (of the respective premiss), or conversely, by deleting a formula from the succedent and putting it into the antecedent of a premiss. The conclusion of such a rule is provided by what remains intact in the input sequent. Let us see what kind of rules can be generated on the basis of LP (or EP if we wish), expressed as the sequent (1=) τ₁ = τ₂, φ[x/τ₁] ⇒ φ[x/τ₂]. We obtain the following equivalent rules:

$$(2{=})\ \frac{\varphi[\mathbf{x}/\tau\_2], \Gamma \Rightarrow \Delta}{\tau\_1 = \tau\_2, \varphi[\mathbf{x}/\tau\_1], \Gamma \Rightarrow \Delta} \qquad\qquad (3{=})\ \frac{\Gamma \Rightarrow \Delta, \varphi[\mathbf{x}/\tau\_1]}{\tau\_1 = \tau\_2, \Gamma \Rightarrow \Delta, \varphi[\mathbf{x}/\tau\_2]}$$

$$(4{=})\ \frac{\Gamma \Rightarrow \Delta, \tau\_1 = \tau\_2}{\varphi[\mathbf{x}/\tau\_1], \Gamma \Rightarrow \Delta, \varphi[\mathbf{x}/\tau\_2]} \qquad\qquad (5{=})\ \frac{\Gamma \Rightarrow \Delta, \tau\_1 = \tau\_2 \quad \Pi \Rightarrow \Sigma, \varphi[\mathbf{x}/\tau\_1]}{\Gamma, \Pi \Rightarrow \Delta, \Sigma, \varphi[\mathbf{x}/\tau\_2]}$$

$$(6{=})\ \frac{\Gamma \Rightarrow \Delta, \tau\_1 = \tau\_2 \quad \varphi[\mathbf{x}/\tau\_2], \Pi \Rightarrow \Sigma}{\varphi[\mathbf{x}/\tau\_1], \Gamma, \Pi \Rightarrow \Delta, \Sigma} \qquad\qquad (7{=})\ \frac{\Gamma \Rightarrow \Delta, \varphi[\mathbf{x}/\tau\_1] \quad \varphi[\mathbf{x}/\tau\_2], \Pi \Rightarrow \Sigma}{\tau\_1 = \tau\_2, \Gamma, \Pi \Rightarrow \Delta, \Sigma}$$

$$(8{=})\ \frac{\Gamma \Rightarrow \Delta, \tau\_1 = \tau\_2 \qquad \Pi \Rightarrow \Sigma, \varphi[\mathbf{x}/\tau\_1] \qquad \varphi[\mathbf{x}/\tau\_2], \Lambda \Rightarrow \Theta}{\Gamma, \Pi, \Lambda \Rightarrow \Delta, \Sigma, \Theta}$$
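To see the interderivability claimed by Theorem 4.1, note, e.g., that the sequent (1=) is obtained from (2=) by instantiating Γ as empty and Δ as {φ[x/τ₂]}, starting from an axiom:

$$\frac{\varphi[\mathbf{x}/\tau\_2] \Rightarrow \varphi[\mathbf{x}/\tau\_2]}{\tau\_1 = \tau\_2, \varphi[\mathbf{x}/\tau\_1] \Rightarrow \varphi[\mathbf{x}/\tau\_2]}\ (2{=})$$

Conversely, (2=) is obtained from (1=) by one cut on φ[x/τ₂] with the premiss of (2=).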

Each of them may be used to express LP as a rule of SC, and in fact all of them have been so used. For example, Negri and von Plato (2001) applied (2=) (but with the repetition of the active formula in the premiss, to save the admissibility of contraction; see the rule (RE) above), and Manzano (1999) prefers its dual form (3=). Some authors, for example Parlamento and Previale (2019), used both (2=) and (3=), although this is redundant. Note also that both (2=) and (3=) may be seen as special forms of (SUB) (with the substitution restricted to the single formula φ). (4=) was applied by Reeves (1987), although in the framework of tableaux and in an apparently different way. He considers rules modelled not on LP but on CP, so the general schema would rather be:

$$(\text{CP})\ \frac{\Gamma\_1 \Rightarrow \Delta\_1, \tau\_1 = \tau\_1' \qquad \dots \qquad \Gamma\_n \Rightarrow \Delta\_n, \tau\_n = \tau\_n'}{\varphi(\tau\_1, \dots, \tau\_n), \Gamma\_1, \dots, \Gamma\_n \Rightarrow \Delta\_1, \dots, \Delta\_n, \varphi(\tau\_1', \dots, \tau\_n')}$$

and similarly for CT. Moreover, he works with tableaux, so the rules branch downwards and the nodes are formulae, not sequents.

Indrzejczak (2019) used (5=), whereas Baaz and Leitsch (2011) used both (5=) and its dual (6=). Nagashima (1966) used (7=) and Indrzejczak (2018a) (8=).

Let us discuss these rules in the light of the properties required from well-behaved SC rules. Although all these rules are separate, most of them are not satisfactory with respect to the other features. Only (2=), (3=) and (7=) are weakly symmetric and explicit, in the sense that they may be treated as equality introduction rules. Together with REF treated as a 0-premiss rule we may even say that such a pair is symmetric. Only (2=) and (3=) satisfy the subformula property. In the remaining cases this property holds in a generalised sense: in any (cut-free) proof either subformulae of the proven sequent or atomic formulae occur.

What about cut elimination? Before we answer this question it should also be established in what form the reflexivity of equality is represented in the system. On the basis of the rule-generation theorem, the only nontrivial rule (apart from the axiomatic sequent REF) is the above-mentioned:

The Logicality of Equality 223

$$\text{(REP)}\qquad\qquad\qquad\qquad\qquad\qquad\frac{\tau=\tau,\Gamma\Rightarrow\Delta}{\Gamma\Rightarrow\Delta}$$

which is used by Negri and von Plato (2001) (but also by Nagashima, 1966 and Gallier, 1986), whereas other authors prefer axiomatic sequents. However, this is not an introduction but rather an elimination rule. Of course, we can also think of making use of the Kalish and Montague (1964) solution from their ND system. Let us consider the possibility of applying the rule-generation theorem to the sequent:


$$\forall \mathbf{x} (\mathbf{x} = \tau \to \varphi) \Rightarrow \varphi[\mathbf{x}/\tau], \text{ where } \mathbf{x} \text{ is not free in } \tau,$$

corresponding to the reflexivity axiom. It may be expressed as a rule in the following ways:

$$(1)\ \frac{\varphi[\mathbf{x}/\tau], \Gamma \Rightarrow \Delta}{\forall \mathbf{x}(\mathbf{x} = \tau \to \varphi), \Gamma \Rightarrow \Delta} \qquad\qquad (2)\ \frac{\Gamma \Rightarrow \Delta, \forall \mathbf{x}(\mathbf{x} = \tau \to \varphi)}{\Gamma \Rightarrow \Delta, \varphi[\mathbf{x}/\tau]}$$

$$(3)\ \frac{\Gamma \Rightarrow \Delta, \forall \mathbf{x}(\mathbf{x} = \tau \to \varphi) \qquad \varphi[\mathbf{x}/\tau], \Pi \Rightarrow \Sigma}{\Gamma, \Pi \Rightarrow \Delta, \Sigma}$$

Moreover, variants (2) and (3) may be improved in a way which dispenses with other constants:

$$(2')\ \frac{a=\tau,\Gamma\Rightarrow\Delta,\varphi[\mathbf{x}/a]}{\Gamma\Rightarrow\Delta,\varphi[\mathbf{x}/\tau]} \qquad\qquad (3')\ \frac{a=\tau,\Gamma\Rightarrow\Delta,\varphi[\mathbf{x}/a] \qquad \varphi[\mathbf{x}/\tau],\Pi\Rightarrow\Sigma}{\Gamma,\Pi\Rightarrow\Delta,\Sigma}$$

where a is not free in τ, Γ, Δ.

Such rules are closer to the ordinary way of defining rules in the SC setting, since they are separate in the sense that no other constant is present in the schema. On the other hand, they are elimination, not introduction rules, similarly to (REP).

The last option for a nontrivial rule expressing the reflexivity of equality is Read's rule from ND, presented in the previous section. It may also be used in the SC framework for the formalization of reflexivity. It looks like this:

$$\frac{\varphi\left[\mathbf{x}/\tau\_1\right], \Gamma \Rightarrow \Delta, \varphi\left[\mathbf{x}/\tau\_2\right]}{\Gamma \Rightarrow \Delta, \tau\_1 = \tau\_2},$$

where the predicate of φ is not present in Γ, Δ.

Such a rule is also considered by Parlamento and Previale (2019), whereas Restall (2020) considers a rule that is closer to Więckowski's solution (in fact he considers a stronger rule; see the next section):

$$(\Rightarrow{=})\ \frac{\varphi[\mathbf{x}/\tau\_1], \Gamma \Rightarrow \Delta, \varphi[\mathbf{x}/\tau\_2] \qquad \varphi[\mathbf{x}/\tau\_2], \Gamma \Rightarrow \Delta, \varphi[\mathbf{x}/\tau\_1]}{\Gamma \Rightarrow \Delta, \tau\_1 = \tau\_2}$$

where the predicate of φ is not present in Γ, Δ.

These two rules seem to be the best choice from the syntactical point of view, since they are nontrivial equality introduction rules. Moreover, they are separate, weakly symmetric (together with (2=), (3=) or (7=) even symmetric) and explicit. They also satisfy the subformula property in the generalised sense. Note also that for

any simple applied language with finitely many primitive predicates, these rules may be replaced with rules which do not refer to fresh predicate parameters. Consider a language having only n unary predicates; then a suitable rule will have just n (Read's variant) or 2n (Restall's variant) premisses. Of course, in the case of languages having k-ary predicate constants (for k > 1) the situation is more complicated, since for every such predicate we must have k (Read's variant) or 2k premisses which take into account all positions of τ₁, τ₂ as arguments of this predicate, exactly as in Quine's counterpart of LA.

Now let us consider which combinations of rules yield a cut-free SC. First we consider only how the rules (2=)–(8=) behave in this respect, together with (REF) or (REP). It is well known, as shown in Negri and von Plato (2001), that SC with (REP) is cut-free. In fact, (2=) also allows for cut elimination in LK, again only with (REP). A similar situation holds for (6=). On the other hand, (5=), (7=) and (8=) provide a cut-free LK independently of the choice of (REP) or (REF). For systems with (3=) and (4=) it is not clear whether cut elimination can be constructively proved. Consider the following:

$$\frac{\dfrac{\Gamma \Rightarrow \Delta, a = c}{a = b, \Gamma \Rightarrow \Delta, b = c}\ (3{=}) \qquad \dfrac{\Pi \Rightarrow \Sigma, Ab}{b = c, \Pi \Rightarrow \Sigma, Ac}\ (3{=})}{a = b, \Gamma, \Pi \Rightarrow \Delta, \Sigma, Ac}\ (\text{Cut})$$

It is possible neither to reduce the height of this cut nor the complexity of the cut formula in the standard manner. A similar counterexample may easily be provided for (4=). One can easily notice that in such cases the problem is connected with the fact that equalities are allowed as instances of φ in the schemata of the respective rules. Of course, if we think of the rules for equality not only as satisfying some desirable syntactic criteria for logicality, but also as being in a sense definitions of this constant, it would be desirable to restrict the instances of φ to atomic formulae other than equalities. But then neither symmetry nor transitivity of equality can be proved. In fact, this holds for all seven rules considered in connection with (REF) or (REP).
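To see why, note, e.g., that the derivation of symmetry with (7=) and (REF) requires instantiating φ with an equality, namely φ(x) := x = τ₁:

$$\frac{\Rightarrow \tau\_1 = \tau\_1 \qquad \tau\_2 = \tau\_1 \Rightarrow \tau\_2 = \tau\_1}{\tau\_1 = \tau\_2 \Rightarrow \tau\_2 = \tau\_1}\ (7{=})$$

If equalities are excluded as instances of φ, no such proof is available.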

Although most of the systems are cut-free, taking into account the other properties, the only reasonable candidate for our aim is LK with (REF) and (7=). In the remaining combinations at least one rule is not a rule of equality introduction. Still, (REF) is also a rather poor candidate for our aim. So eventually we should take (7=) and (⇒=) (or its one-premiss version due to Read) as the pair of rules which, at least at first sight, look better. In the next section we examine LK with such a pair of rules in the light of the criteria of logicality proposed by Hacking.

#### **5 Hacking's criterion**

Hacking (1979) based his considerations on the criteria of logicality on the standard form of SC with canonical logical and structural rules.<sup>5</sup> He did not consider equality

<sup>5</sup> Although he uses sequents built from finite sets, so contraction rules are dispensable.

but, as we will show, his analysis may be applied to this constant. In fact, Hacking follows closely Gentzen's suggestions concerning rules as possible definitions of constants. Hence rules of this kind have to satisfy all the properties we discussed in the preceding section, with special attention paid to the subformula property. As we noticed, the two rules we have chosen satisfy this property only in the generalised sense, but it seems that this generalisation is reasonable. However, this is not enough. In order to satisfy Hacking's requirements of logicality it should be possible to show three elimination theorems for a system consisting of such rules:

- reduction of logical axioms to atomic form;
- reduction of applications of weakening to the atomic level;
- cut elimination.
The last one is Gentzen's famous Hauptsatz, whereas the remaining ones have a slightly weaker character, since they do not postulate the complete elimination of purely logical axioms or weakening but only their reduction to the atomic level. In fact, if we consider purely logical versions of SC, i.e., without primitive structural rules, like G3,<sup>6</sup> then atomic axioms are present as primitive and all these requirements may be presented in a more uniform way as respective admissibility results. However, we opt for LK as our basis and check how it behaves when enriched with equality rules.

We start with SC for pure and applied FOLI. As we concluded in the preceding section, the only reasonable candidates are (7=) as the antecedent introduction rule, and Read's or Restall's rule as the succedent introduction rule. Accordingly, we consider two variants, and in both (7=) will be taken (possibly in two symmetric versions) as the antecedent introduction rule. LKI1 is LK with:

$$(1{=}{\Rightarrow})\ \frac{\Gamma \Rightarrow \Delta, \varphi[a/\tau\_1] \qquad \varphi[a/\tau\_2], \Pi \Rightarrow \Sigma}{\tau\_1 = \tau\_2, \Gamma, \Pi \Rightarrow \Delta, \Sigma}$$

$$(1{\Rightarrow}{=})\ \frac{\varphi[a/\tau\_1], \Gamma \Rightarrow \Delta, \varphi[a/\tau\_2]}{\Gamma \Rightarrow \Delta, \tau\_1 = \tau\_2}$$

where in the latter rule φ is atomic with a predicate not in Γ, Δ (in the antecedent introduction rule φ is an arbitrary atomic formula).

In LKI2 we have instead:

$$(1{=}{\Rightarrow})\ \frac{\Gamma \Rightarrow \Delta, \varphi[a/\tau\_1] \qquad \varphi[a/\tau\_2], \Pi \Rightarrow \Sigma}{\tau\_1 = \tau\_2, \Gamma, \Pi \Rightarrow \Delta, \Sigma}$$

$$(2{=}{\Rightarrow})\ \frac{\Gamma \Rightarrow \Delta, \varphi[a/\tau\_2] \qquad \varphi[a/\tau\_1], \Pi \Rightarrow \Sigma}{\tau\_1 = \tau\_2, \Gamma, \Pi \Rightarrow \Delta, \Sigma}$$

<sup>6</sup> See, e.g., Troelstra and Schwichtenberg (1996) or Negri and von Plato (2001).

$$(2{\Rightarrow}{=})\ \frac{\varphi[a/\tau\_1], \Gamma \Rightarrow \Delta, \varphi[a/\tau\_2] \qquad \varphi[a/\tau\_2], \Pi \Rightarrow \Sigma, \varphi[a/\tau\_1]}{\Gamma, \Pi \Rightarrow \Delta, \Sigma, \tau\_1 = \tau\_2}$$

where in the latter rule φ is atomic with a predicate not in Γ, Δ.

The distinction between LKI1 and LKI2 follows from the way we treat equalities. If they are treated as atomic formulae, LKI1 is sufficient; otherwise we need LKI2. The key point is how to prove symmetry in both systems; in the former we have the following proof:

$$\frac{\dfrac{A\tau\_1 \Rightarrow A\tau\_1}{\Rightarrow \tau\_1 = \tau\_1}\ (1{\Rightarrow}{=}) \qquad \tau\_2 = \tau\_1 \Rightarrow \tau\_2 = \tau\_1}{\tau\_1 = \tau\_2 \Rightarrow \tau\_2 = \tau\_1}\ (1{=}{\Rightarrow})$$

where in the application of (1=⇒) φ is a = τ₁, so that φ[a/τ₁] is τ₁ = τ₁ and φ[a/τ₂] is τ₂ = τ₁.

In LKI2 it looks like this:

$$\frac{\dfrac{A\tau\_2 \Rightarrow A\tau\_2 \qquad A\tau\_1 \Rightarrow A\tau\_1}{\tau\_1 = \tau\_2, A\tau\_2 \Rightarrow A\tau\_1}\ (2{=}{\Rightarrow}) \qquad \dfrac{A\tau\_1 \Rightarrow A\tau\_1 \qquad A\tau\_2 \Rightarrow A\tau\_2}{\tau\_1 = \tau\_2, A\tau\_1 \Rightarrow A\tau\_2}\ (1{=}{\Rightarrow})}{\tau\_1 = \tau\_2 \Rightarrow \tau\_2 = \tau\_1}\ (2{\Rightarrow}{=})$$

Both systems are complete, and it is not difficult to extend them to cover complex terms. However, the suitable antecedent introduction rules for equality must be modified in a way which ensures the derivability of CT. It is only necessary to require that, in the two atomic side formulae occurring in the premisses, the respective occurrences of the terms which are arguments of the principal formula may appear not only as direct arguments of the predicate but also as arguments of complex terms which are its arguments. Under this generalised understanding, a proof of the simplest case of CT in LKI2 for some unary operation f looks like this:

$$\frac{\dfrac{Af\tau\_1 \Rightarrow Af\tau\_1 \qquad Af\tau\_2 \Rightarrow Af\tau\_2}{\tau\_1 = \tau\_2, Af\tau\_1 \Rightarrow Af\tau\_2}\ (1{=}{\Rightarrow}) \qquad \dfrac{Af\tau\_2 \Rightarrow Af\tau\_2 \qquad Af\tau\_1 \Rightarrow Af\tau\_1}{\tau\_1 = \tau\_2, Af\tau\_2 \Rightarrow Af\tau\_1}\ (2{=}{\Rightarrow})}{\tau\_1 = \tau\_2 \Rightarrow f\tau\_1 = f\tau\_2}\ (2{\Rightarrow}{=})$$

There are other solutions which work as well as LKI2 but are simpler. A close inspection of the proofs needed to establish completeness shows that it is also possible to obtain two variants of LKI1 keeping the proviso concerning equalities from LKI2 (i.e., that they are not treated as atoms which are possible instances of the antecedent introduction rule). Instead of the single two-premiss rule (2⇒=) we can add one-premiss rules: it is enough either to use both rules of antecedent introduction, as in LKI2, or to add a symmetric version of (1⇒=) (corresponding to the right premiss of (2⇒=)):

$$(1{\Rightarrow}{=}')\ \frac{\varphi[a/\tau\_2], \Gamma \Rightarrow \Delta, \varphi[a/\tau\_1]}{\Gamma \Rightarrow \Delta, \tau\_1 = \tau\_2}$$

(with the same proviso concerning the fresh predicate). It may easily be checked that suitable proofs of symmetry and transitivity may be obtained from the proofs in LKI2 stated above by deleting the derivations of one branch. This provides an adequate SC for FOL with equality and simplifies many proofs, since the branching factor is lower.
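For instance, transitivity is provable in these variants as follows, with A a fresh unary predicate:

$$\frac{\dfrac{A\tau\_1 \Rightarrow A\tau\_1 \qquad \dfrac{A\tau\_2 \Rightarrow A\tau\_2 \qquad A\tau\_3 \Rightarrow A\tau\_3}{\tau\_2 = \tau\_3, A\tau\_2 \Rightarrow A\tau\_3}\ (1{=}{\Rightarrow})}{\tau\_1 = \tau\_2, \tau\_2 = \tau\_3, A\tau\_1 \Rightarrow A\tau\_3}\ (1{=}{\Rightarrow})}{\tau\_1 = \tau\_2, \tau\_2 = \tau\_3 \Rightarrow \tau\_1 = \tau\_3}\ (1{\Rightarrow}{=})$$

Note that only one antecedent introduction rule and the one-premiss succedent rule are needed here, so the derivation has no branching beyond the axioms.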


The last issue concerns the question of how these variants of SC fare with respect to Hacking's criteria of logicality. We have noticed above that the subformula property holds only in the generalised sense, which is acceptable. As for the eliminability conditions, one may easily check that the first two hold for all the stated versions of LKI. What about cut elimination? First of all, note that in this respect LKI1 is not a suitable system. Consider the case where the cut formula is an equality which is the principal formula of the last applied rule, and moreover the side formulae of the (1=⇒) application were equalities:

$$\frac{\dfrac{\varphi(\tau\_1), \Gamma \Rightarrow \Delta, \varphi(\tau\_2)}{\Gamma \Rightarrow \Delta, \tau\_1 = \tau\_2}\ (1{\Rightarrow}{=}) \qquad \dfrac{\Pi \Rightarrow \Sigma, \tau\_1 = \tau\_3 \qquad \tau\_2 = \tau\_3, \Pi \Rightarrow \Sigma}{\tau\_1 = \tau\_2, \Pi \Rightarrow \Sigma}\ (1{=}{\Rightarrow})}{\Gamma, \Pi \Rightarrow \Delta, \Sigma}\ (\text{Cut})$$

In this case there is no possibility of making a reduction either on the height or on the degree (complexity) of the cut formula.

Perhaps LKI2, or the modified versions of LKI1 with an additional rule but excluding equalities as atomic instances, are better. Again consider the case where the cut formula is an equality which is the principal formula of the last applied rule:

$$\frac{\dfrac{\varphi(\tau\_1), \Gamma \Rightarrow \Delta, \varphi(\tau\_2) \qquad \varphi(\tau\_2), \Gamma \Rightarrow \Delta, \varphi(\tau\_1)}{\Gamma \Rightarrow \Delta, \tau\_1 = \tau\_2}\ (2{\Rightarrow}{=}) \qquad \dfrac{\Pi \Rightarrow \Sigma, \psi(\tau\_1) \qquad \psi(\tau\_2), \Pi \Rightarrow \Sigma}{\tau\_1 = \tau\_2, \Pi \Rightarrow \Sigma}\ (1{=}{\Rightarrow})}{\Gamma, \Pi \Rightarrow \Delta, \Sigma}\ (\text{Cut})$$

(The argument for the right premiss deduced by (2=⇒) is symmetric.) We assume that all atomic formulae have complexity 0 but equalities have complexity 1, since = is treated on a par with the other logical constants. In order to apply induction on the complexity of the cut formula we should, however, first unify the side formulae in the premisses of both applications of the equality rules. The situation is analogous to the reduction made when the cut formula is introduced by quantifier rules. In that case we first apply substitution to the fresh parameter of the premiss of (⇒∀) (or (∃⇒)) and then we can make a reduction by performing cuts on the premisses. In the present case, if a similar kind of substitution were possible, the following reduction of the cut degree would do:

$$\frac{\dfrac{\Pi \Rightarrow \Sigma, \psi(\tau\_1) \qquad \psi(\tau\_1), \Gamma \Rightarrow \Delta, \psi(\tau\_2)}{\Pi, \Gamma \Rightarrow \Sigma, \Delta, \psi(\tau\_2)}\ (\text{Cut}) \qquad \psi(\tau\_2), \Pi \Rightarrow \Sigma}{\Pi, \Gamma, \Pi \Rightarrow \Sigma, \Delta, \Sigma}\ (\text{Cut})$$

where in the middle sequent ψ(τ₁), ψ(τ₂) were substituted for the unique occurrences of φ(τ₁), φ(τ₂) in the leftmost premiss of the previous figure. The problem is that no corresponding result on the substitution of fresh atomic formulae can be proved in the presence of the rules for antecedent introduction. Consider the following example, which shows the source of the problem:

$$\frac{\tau\_3 = \tau\_2, \varphi(\tau\_1), \Gamma\_1 \Rightarrow \Delta\_1, \varphi(\tau\_2)}{\tau\_3 = \tau\_2, \varphi(\tau\_2), \Gamma\_2 \Rightarrow \Delta\_2, \varphi(\tau\_1)}$$

Assuming that this is a proof of the left premiss of the considered cut application, we cannot change φ for ψ, since the middle top sequent will not be an axiom.

This shows that in the case of pure and applied FOLI our solution does not satisfy the most important condition in the list of logicality criteria provided by Hacking. Since it is not applicable to simple FOLI, there is only one possibility left: that it works for simple applied versions of FOLI. Certainly it works for all languages having only one unary predicate constant. In this case all the rules of LKI2 (or of the variants of LKI1) have the same shape, but with the sole predicate always instantiating φ and with no side condition for the succedent introduction rules. All proofs remain intact; moreover, the problem connected with the proof of cut elimination does not arise, since there is only one atomic formula available in both premisses of cut, i.e., φ(τ) and ψ(τ) coincide, and the reduction of the cut degree holds. What about languages with a richer signature? Following Quine's recipe mentioned in Section 3 we must suitably modify both rules of succedent introduction: in the case of (1⇒=), for each n-ary predicate constant we must use n premisses with the respective term as an argument in all positions; for (2⇒=) we must introduce 2n such premisses. For example, a suitable form of (1⇒=) for the language with unary A and binary R, which was considered in Section 3 for the illustration of LA′, looks like this:

$$\frac{A\tau\_1, \Gamma\_1 \Rightarrow \Delta\_1, A\tau\_2 \qquad R\tau\_1 a, \Gamma\_2 \Rightarrow \Delta\_2, R\tau\_2 a \qquad Ra\tau\_1, \Gamma\_3 \Rightarrow \Delta\_3, Ra\tau\_2}{\Gamma\_1, \Gamma\_2, \Gamma\_3 \Rightarrow \Delta\_1, \Delta\_2, \Delta\_3, \tau\_1 = \tau\_2}$$

where a does not occur in the Γᵢ, Δᵢ, τ₁, τ₂.

The rules for antecedent introduction remain intact, but φ is now an instance of an arbitrary predicate constant with at least one occurrence of the respective term. Such versions of SC, with succedent introduction rules having no fixed number of premisses but one relative to the signature, are not very satisfactory. In particular, from the proof-search point of view, in the case of a richer signature the branching factor is too big to use them in practice. However, we are not concerned here with practical applications but with the satisfaction of theoretical desiderata, and from this point of view this solution works. In particular, cut elimination can be proved, since every application of (2⇒=) (and of (1⇒=) in the variants of LKI1) always has among its premisses one whose side formula is identical to the ones occurring in the actual premisses of (1=⇒) or (2=⇒).

#### **6 Došen's criterion**

In this section we focus on the criterion of logicality proposed by Došen. It can be seen as a successful refinement of a proposal due to Popper (1947a; 1947b), concerned with a conception of proof-theoretic semantics which was, however, not articulated in a satisfactory way. Popper tried to characterize constants by means of inferential definitions which yield double-valid rules characterising constants of the form:

$$(\rightarrow)\ \frac{\varphi, \chi \Rightarrow \psi}{\chi \Rightarrow \varphi \rightarrow \psi} \qquad\qquad\qquad\qquad (\wedge)\ \frac{\varphi, \psi \Rightarrow \chi}{\varphi \wedge \psi \Rightarrow \chi}$$

The Logicality of Equality 229

$$(\lor)\ \frac{\varphi \Rightarrow \chi \qquad \psi \Rightarrow \chi}{\varphi \lor \psi \Rightarrow \chi} \qquad\qquad\qquad (\neg)\ \frac{\varphi, \psi \Rightarrow \chi}{\neg\chi, \psi \Rightarrow \neg\varphi}$$

Popper's project was criticised by Kleene, Curry, and many others; however, it was convincingly shown by Schroeder-Heister (1984) (see also Schroeder-Heister, 2006 and Binder and Piecha, 2021) that his works contain an interesting proposal for a slightly weaker plan of providing criteria for being a logical constant. Such an enterprise was undertaken much later by Došen, first in his doctoral thesis and then in Došen (1989). It seems that if we do not use such rules as a way of establishing a variant of proof-theoretic semantics but only as an independent criterion of logicality, the approach works well. In fact, opinions on the very nature of the relationship between criteria of logicality and proof-theoretic semantics differ strongly. Despite Došen's own remarks that in his proposal rules are not intended as definitions of constants7, it may be treated as a good framework even for developing proof-theoretic semantics. The idea of applying double-line rules as definitional rules reappeared in such significantly different frameworks as Koslow's structural approach to logic (see Koslow, 1992), Sambin's Basic Logic (see Sambin, Battilotti, and Faggian, 2000), and categorical logic (see Maruyama, 2016). Moreover, this strong interpretation of Došen's proposal is provided independently by Gratzl and Orlandelli (2017) and by Restall (2019). In particular, the former work proposes an interesting explanation of the reasons for such a choice in terms of harmony. Since the 1960s, harmony has been treated as crucial for the explanation of proper rules for defining logical constants, and a lot of work has been offered in which a clarification of this notion was provided8. Harmony is in general understood as a kind of balance between two kinds of rules: introduction and elimination in ND, or antecedent and succedent introduction in SC. However, this notion is explicated in many different ways.
Gratzl and Orlandelli (2017) proposed an explanation of harmony as a kind of deductive equilibrium, which was first proposed by Tennant (2010). Their approach may be seen as an improvement of Tennant's solution in the sense that it is purely local, i.e., an analysis of a constant in terms of rules is independent of what other constants are already present in the language. Moreover, in contrast to other approaches to harmony, it allows for the unique determination of one kind of rules from the other kind and vice versa.

Došen's system described below serves as an exemplification of his theory of criteria of logicality. The starting point of the analysis of logical constants is the conviction that logic is the science of formal proofs. Hence basic formal proofs are of purely structural character, i.e., ones in which only structural rules are applied9. It follows that an expression is logical if it is analysable in purely structural terms. As he emphasized in the title of his paper, logical constants are punctuation marks. An analysis should satisfy three conditions:

<sup>7</sup> In particular, the proposed criteria do not necessarily satisfy the criteria of eliminability and non-creativity required from well-stated definitions.

<sup>8</sup> See, e.g., Schroeder-Heister (2012), Poggiolesi (2011) or Kürbis (2019).

<sup>9</sup> It is in a sense a development of Hertz's programme (see Hertz, 1929).


His rules provide such an analysis in the sense that on the one side we have only structural sequents, i.e., with no constant displayed. According to Došen, in order to claim that an expression is a logical constant it is necessary to find a double-valid rule which, after addition to the structural rules, allows for obtaining a full characterisation of this constant. Došen (1989) proposed a structural version of LK, which we will call SLK, in the language without negation but with ⊥ and ⊤; we introduce a rule for negation instead. In this system the set of structural rules is primitive and not eliminable. Every constant is characterised by means of only one, but double-line (i.e., invertible), rule:

$$(\rightarrow)\ \frac{\varphi, \Gamma \Rightarrow \Delta, \psi}{\Gamma \Rightarrow \Delta, \varphi \rightarrow \psi} \qquad (\wedge)\ \frac{\Gamma \Rightarrow \Delta, \varphi \qquad \Gamma \Rightarrow \Delta, \psi}{\Gamma \Rightarrow \Delta, \varphi \wedge \psi} \qquad (\neg)\ \frac{\varphi, \Gamma \Rightarrow \Delta}{\Gamma \Rightarrow \Delta, \neg\varphi}$$

$$(\lor)\ \frac{\varphi, \Gamma \Rightarrow \Delta \qquad \psi, \Gamma \Rightarrow \Delta}{\varphi \lor \psi, \Gamma \Rightarrow \Delta} \qquad (\forall)^{1}\ \frac{\Gamma \Rightarrow \Delta, \varphi[x/a]}{\Gamma \Rightarrow \Delta, \forall x\varphi} \qquad (\exists)^{1}\ \frac{\varphi[x/a], \Gamma \Rightarrow \Delta}{\exists x\varphi, \Gamma \Rightarrow \Delta}$$

where in the latter two rules a is not in Γ, Δ, φ. In each case, in addition to the rule of introduction we also have a rule of elimination if we read the rule upside down. Every rule is then a counterpart of a suitable equivalence characterising the respective constant within Scott's theory of consequence relations (see Scott, 1974). In what follows we will use the notation (↓→) and (↑→) for the suitable halves of the rule for implication, and similarly for the other constants.
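Read as equivalences on a consequence relation, the double-line rules for implication and conjunction may be sketched, for instance, as:

$$\varphi, \Gamma \vdash \Delta, \psi \ \text{ iff }\ \Gamma \vdash \Delta, \varphi \rightarrow \psi \qquad\qquad \Gamma \vdash \Delta, \varphi \text{ and } \Gamma \vdash \Delta, \psi \ \text{ iff }\ \Gamma \vdash \Delta, \varphi \wedge \psi$$

The downward direction of each rule is the "only if" half of the corresponding equivalence, and the upward direction the "if" half.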

The first condition is obviously satisfied for all rules. The second one can be shown by providing a proof of equivalence with some standard version of SC which is known to be adequate; our version of LK is sufficient. Since one half of each of Došen's rules corresponds to a suitable introduction rule, such a proof amounts in principle to a demonstration of the derivability of the remaining rules by means of structural rules only. For example, if we take his rule for implication, then (↓→) is exactly our (⇒→), and the following shows the derivability of (→⇒) by means of (↑→):

$$\frac{\Gamma \Rightarrow \Delta, \varphi \qquad \dfrac{\dfrac{\varphi \rightarrow \psi \Rightarrow \varphi \rightarrow \psi}{\varphi, \varphi \rightarrow \psi \Rightarrow \psi}\,(\uparrow\rightarrow) \qquad \psi, \Pi \Rightarrow \Sigma}{\varphi, \varphi \rightarrow \psi, \Pi \Rightarrow \Sigma}\,(\text{Cut})}{\varphi \rightarrow \psi, \Gamma, \Pi \Rightarrow \Delta, \Sigma}\,(\text{Cut})$$

whereas the converse derivability goes as follows:

$$\frac{\Gamma \Rightarrow \Delta, \varphi \rightarrow \psi \qquad \dfrac{\varphi \Rightarrow \varphi \qquad \psi \Rightarrow \psi}{\varphi \rightarrow \psi, \varphi \Rightarrow \psi}\,(\rightarrow\Rightarrow)}{\varphi, \Gamma \Rightarrow \Delta, \psi}\,(\text{Cut})$$

Note that both demonstrations of derivability are structural and, moreover, in addition to the respective logical rules they use only axioms and cut. Such a demonstration of interderivability is sufficient to show that the rules of LK are harmonious in the sense explained by Gratzl and Orlandelli (2017). One may easily check that the remaining halves of Došen's rules are also interderivable with (∧⇒), (¬⇒), (⇒∨), (∀⇒) and (⇒∃). However, in the case of the interderivability of (∧⇒) and (⇒∨) with (↑∧) and (↓∨) we must additionally use contraction and weakening. The derivations are still structural, but some authors tend to be careful with that and admit as fully satisfactory only those analyses where axioms and cut are the only rules (see, e.g., Gratzl and Orlandelli, 2017). However, it is easy to provide a remedy. In fact, Došen's rules were based on Gentzen's original rules for LK, which are additive, and for these rules interderivability requires only axioms and cut. On the other hand, we have chosen all rules to be multiplicative. If we change Došen's rules for conjunction and disjunction into:

$$(\land) \; \frac{\varphi, \psi, \Gamma \Rightarrow \Delta}{\varphi \land \psi, \Gamma \Rightarrow \Delta} \; \qquad \qquad \qquad (\lor) \; \frac{\Gamma \Rightarrow \Delta, \varphi, \psi}{\Gamma \Rightarrow \Delta, \varphi \lor \psi}$$

we can provide interderivability proofs for our version of LK by means of axioms and cut only.
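For conjunction, for instance, the multiplicative rule (⇒∧) can then be obtained from (↑∧) by two cuts on an axiom; a sketch of such a derivation:

$$\frac{\Gamma' \Rightarrow \Delta', \psi \qquad \dfrac{\Gamma \Rightarrow \Delta, \varphi \qquad \dfrac{\varphi \wedge \psi \Rightarrow \varphi \wedge \psi}{\varphi, \psi \Rightarrow \varphi \wedge \psi}\,(\uparrow\wedge)}{\psi, \Gamma \Rightarrow \Delta, \varphi \wedge \psi}\,(\text{Cut})}{\Gamma, \Gamma' \Rightarrow \Delta, \Delta', \varphi \wedge \psi}\,(\text{Cut})$$

Neither weakening nor contraction is needed here, in contrast to the additive formulation.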

On the other hand, note that showing the derivability of (∀⇒) and (⇒∃) in the structural variant requires an additional structural rule of substitution:

$$(\text{SUB})\ \frac{\Gamma \Rightarrow \Delta}{\Gamma[a/\tau] \Rightarrow \Delta[a/\tau]}$$

which is necessary to enable unrestricted instantiation (modulo correct substitution) of terms in these two rules.

The last condition, i.e., uniqueness, may be demonstrated as the provability, for each constant ∗, of the two sequents φ ∗ ψ ⇒ φ ★ ψ and φ ★ ψ ⇒ φ ∗ ψ, where ★ is a notational variant having the same rule. Suitable proofs are trivial in this setting (although not in general — see, e.g., Došen, 1985).
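For implication, for instance, one half of such a uniqueness proof may be sketched as follows, where →★ is the notational variant of → governed by the same double-line rule:

$$\frac{\dfrac{\varphi \rightarrow \psi \Rightarrow \varphi \rightarrow \psi}{\varphi, \varphi \rightarrow \psi \Rightarrow \psi}\,(\uparrow\rightarrow)}{\varphi \rightarrow \psi \Rightarrow \varphi \rightarrow^{\star} \psi}\,(\downarrow\rightarrow^{\star})$$

The converse sequent is proved symmetrically, with the roles of the two rules exchanged.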

How can equality be characterised in this framework? In fact, Došen proposed a rule which is of global character:

$$(=)\qquad \frac{\Gamma[a/\tau\_1] \Rightarrow \Delta[a/\tau\_1]}{\tau\_1 = \tau\_2,\ \Gamma[a/\tau\_2] \Rightarrow \Delta[a/\tau\_2]}$$

In Section 2 we explained why this kind of rule cannot be treated as providing a criterion of logicality. What kind of local rules can be used instead? The obvious candidates are double-line versions of the rules for succedent introduction which we examined in the last section:

$$(1=) \qquad \frac{\varphi[a/\tau\_1],\ \Gamma \Rightarrow \Delta,\ \varphi[a/\tau\_2]}{\Gamma \Rightarrow \Delta,\ \tau\_1 = \tau\_2}$$

$$(2=) \qquad \frac{\varphi[a/\tau\_1], \Gamma \Rightarrow \Delta, \varphi[a/\tau\_2] \qquad \varphi[a/\tau\_2], \Gamma \Rightarrow \Delta, \varphi[a/\tau\_1]}{\Gamma \Rightarrow \Delta, \tau\_1 = \tau\_2}$$

where in both rules φ is an atomic predicate not in Γ, Δ. The first one is an obvious SC version of Read's rule, and the second of Restall's. Let us consider pure or applied FOLI as formalised by Došen's SLK in two variants: SLK1 (with (1=)) and SLK2 (with (2=)). We start with the latter; moreover, we add, similarly as in LKI2, the proviso that equalities are not atomic predicates, so only predicate parameters (and other predicate constants in the applied version) are admitted as instances of φ. This system satisfies the first condition, so we must check the second, i.e., adequacy. Both directions of (2=) are cases of (2⇒=) and (4=) respectively, so SLK2 is sound. Of course we can also easily provide demonstrations of the interderivability of (1=↑) and (2=↑) with (1=⇒) and (2=⇒), which is enough to show that the equality rules of LKI1 and LKI2 are harmonious in the sense of Gratzl and Orlandelli (2017). To show that SLK (in both versions) is complete we must be able to prove the reflexivity axiom and LP, which are immediate:

$$\frac{A\tau \Rightarrow A\tau \qquad A\tau \Rightarrow A\tau}{\Rightarrow \tau = \tau}\,(\downarrow 2=) \qquad\qquad \frac{\tau\_1 = \tau\_2 \Rightarrow \tau\_1 = \tau\_2}{\tau\_1 = \tau\_2,\ \varphi[a/\tau\_1] \Rightarrow \varphi[a/\tau\_2]}\,(\uparrow 2=)$$

where A and φ are arbitrary atomic predicates (parameters or constants).

However, this is not sufficient. Since equalities cannot be instances of φ, we must also provide proofs of symmetry and transitivity for =:

$$\frac{\dfrac{\tau\_1 = \tau\_2 \Rightarrow \tau\_1 = \tau\_2}{\tau\_1 = \tau\_2,\ A\tau\_2 \Rightarrow A\tau\_1}\,(2=\uparrow) \qquad \dfrac{\tau\_1 = \tau\_2 \Rightarrow \tau\_1 = \tau\_2}{\tau\_1 = \tau\_2,\ A\tau\_1 \Rightarrow A\tau\_2}\,(2=\uparrow)}{\tau\_1 = \tau\_2 \Rightarrow \tau\_2 = \tau\_1}\,(2=\downarrow)$$

For transitivity we derive:

$$\frac{\dfrac{\tau\_1 = \tau\_2 \Rightarrow \tau\_1 = \tau\_2}{\tau\_1 = \tau\_2,\ A\tau\_1 \Rightarrow A\tau\_2}\,(2=\uparrow) \qquad \dfrac{\tau\_2 = \tau\_3 \Rightarrow \tau\_2 = \tau\_3}{\tau\_2 = \tau\_3,\ A\tau\_2 \Rightarrow A\tau\_3}\,(2=\uparrow)}{\tau\_1 = \tau\_2,\ \tau\_2 = \tau\_3,\ A\tau\_1 \Rightarrow A\tau\_3}\,(\text{Cut})$$

and

$$\frac{\dfrac{\tau\_2 = \tau\_3 \Rightarrow \tau\_2 = \tau\_3}{\tau\_2 = \tau\_3,\ A\tau\_3 \Rightarrow A\tau\_2}\,(2=\uparrow) \qquad \dfrac{\tau\_1 = \tau\_2 \Rightarrow \tau\_1 = \tau\_2}{\tau\_1 = \tau\_2,\ A\tau\_2 \Rightarrow A\tau\_1}\,(2=\uparrow)}{\tau\_1 = \tau\_2,\ \tau\_2 = \tau\_3,\ A\tau\_3 \Rightarrow A\tau\_1}\,(\text{Cut})$$

which together by (2=↓) yield τ₁ = τ₂, τ₂ = τ₃ ⇒ τ₁ = τ₃.
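Explicitly, this final step instantiates the downward direction of (2=) with φ := Aa, Γ consisting of the two equalities, and τ₁ = τ₃ as the conclusion formula:

$$\frac{A\tau\_1,\ \tau\_1 = \tau\_2,\ \tau\_2 = \tau\_3 \Rightarrow A\tau\_3 \qquad A\tau\_3,\ \tau\_1 = \tau\_2,\ \tau\_2 = \tau\_3 \Rightarrow A\tau\_1}{\tau\_1 = \tau\_2,\ \tau\_2 = \tau\_3 \Rightarrow \tau\_1 = \tau\_3}\,(2=\downarrow)$$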

It is important to note that in both proofs the use of both premisses of (2=) is essential. It is not possible to provide a proof of symmetry and transitivity of = in SLK1 with the same proviso. On the other hand, if we admit equalities as possible instances of φ, SLK1 also turns out to be an adequate formalization of pure or applied FOLI, similarly as LKI1. The proofs of symmetry and transitivity look as follows:

$$\frac{\dfrac{A\tau\_1 \Rightarrow A\tau\_1}{\Rightarrow \tau\_1 = \tau\_1}\,(1=\downarrow) \qquad \dfrac{\tau\_1 = \tau\_2 \Rightarrow \tau\_1 = \tau\_2}{\tau\_1 = \tau\_1,\ \tau\_1 = \tau\_2 \Rightarrow \tau\_2 = \tau\_1}\,(1=\uparrow)}{\tau\_1 = \tau\_2 \Rightarrow \tau\_2 = \tau\_1}\,(\text{Cut})$$

where in the application of (1=↑) on the right φ is a = τ₁, so that φ[a/τ₁] is τ₁ = τ₁ and φ[a/τ₂] is τ₂ = τ₁.

$$\frac{\tau\_1 = \tau\_2 \Rightarrow \tau\_1 = \tau\_2 \qquad \dfrac{\tau\_2 = \tau\_3 \Rightarrow \tau\_2 = \tau\_3}{\tau\_1 = \tau\_2,\ \tau\_2 = \tau\_3 \Rightarrow \tau\_1 = \tau\_3}\,(1=\uparrow)}{\tau\_1 = \tau\_2,\ \tau\_2 = \tau\_3 \Rightarrow \tau\_1 = \tau\_3}\,(\text{Cut})$$

where in the application of (1=↑) on the right φ is τ₁ = a, so that φ[a/τ₂] is τ₁ = τ₂ and φ[a/τ₃] is τ₁ = τ₃. Since the proof of LP is correct also with (1=), and the proof of reflexivity by (1=) is contained in the proof of symmetry above, we are done. Also the proofs of uniqueness are directly obtainable in both versions of SLK. The above proofs also show that simple FOLI may be formalised by means of SLK1; it is enough to change, in the proof of reflexivity, Aτ₁ ⇒ Aτ₁ for τ₁ = τ₁ ⇒ τ₁ = τ₁.

We leave aside the problem of the formalization of simple applied versions of FOLI in SLK; it may be handled in the way described in the preceding section. The only difference is that now such many-premiss rules are treated as double-valid. In particular, in the case of the language with just one unary predicate we can use just (2=) without the proviso that φ is fresh.

There is no problem with the treatment of complex terms in the structural variant, provided we treat substitution of terms in the same way as in the systems of the preceding section. For illustration we can show how to provide a proof of CT for a binary operation f, i.e., of the form ∀abcd(a = b ∧ c = d → fac = fbd).

We can prove:

$$\frac{\dfrac{a = b \Rightarrow a = b}{a = b,\ Afac \Rightarrow Afbc}\,(2=\uparrow) \qquad \dfrac{a = b \Rightarrow a = b}{a = b,\ Afbc \Rightarrow Afac}\,(2=\uparrow)}{a = b \Rightarrow fac = fbc}\,(2=\downarrow)$$

In a similar way we prove c = d ⇒ fbc = fbd. Since fac = fbc, fbc = fbd ⇒ fac = fbd is provable as an instance of transitivity, we obtain a = b, c = d ⇒ fac = fbd by two applications of cut, and then the result by (∧), (→), (∀). This proof generalizes to any n-ary operation.

We finish this section with the remark that, similarly as in the case of other constants, there is another possibility of characterizing equality by means of an invertible rule. We can use the antecedent-based rule:

$$(=3) \qquad \frac{\Gamma \Rightarrow \Delta, \varphi[a/\tau\_1], \varphi[a/\tau\_2] \qquad \varphi[a/\tau\_1], \varphi[a/\tau\_2], \Gamma \Rightarrow \Delta}{\tau\_1 = \tau\_2,\ \Gamma \Rightarrow \Delta}$$

This rule has the advantage of having no side condition on φ except that it is atomic. However, showing that it satisfies Došen's criteria is slightly more involved, and we do not pursue this task here.

#### **7 Final comments**

The results of our analysis do not provide a decisive answer to the problem that was posed concerning the logicality of equality. They show that the status of equality as a logical constant is dependent not only on the criteria which are taken into account but is also sensitive to the version of the language to which equality is appended. In this respect the structural variant of Došen works better. All criteria of logicality hold for the pure, applied and simple applied variants of FOLI introduced in Section 6. Moreover, the invertible rules of this calculus allow us to show that the standard introduction rules from Section 5 are harmonious in the sense advocated by Gratzl and Orlandelli (2017). However, in the latter case, i.e., the SC realizing Hacking's approach, it is striking that the most important condition, namely cut eliminability, fails for the pure and applied versions of FOLI. On the other hand, it can be proved for many other sequent formalizations of FOLI, but with non-canonical equality rules. Such rules, however, cannot be considered as proof-theoretic characterisations of a logical constant. Note also that simple FOLI does not satisfy the criteria of logicality in either formalization. Neither does cut elimination hold for the LKI variants, nor does the first condition of Došen's analysis hold if equality is the sole predicate.

Our analysis was provided in the setting of standard SC and its slightly nonstandard variant, but still based on the standard notion of a sequent. One may ask if the application of some other, generalised setting may show in a more decisive way that equality satisfies the criteria of logicality. For example, in Hacking (1979) modal operators are taken as a negative example, since in the setting of standard SC they cannot be characterised by means of canonical rules. But one can find more satisfying solutions on the basis of generalised formalisations. For example, in the setting of display logic one can provide rules for modal operators satisfying Hacking's demands (see, e.g., Wansing, 1999).

One possible generalization which works for equality can be based on using sequents with terms occurring on a par with formulae. The idea of formal systems with terms treated as fully fledged elements of deductions is not new. It was first introduced by Jaśkowski (1934) in his first system of ND. Quite recently the idea was independently taken up and developed in the setting of ND by Textor (2017) and Gazzari (2019). In the framework of SC it was developed by Restall (2019) and by Indrzejczak (2021). It seems that in this slightly generalised setting not only equality but also several kinds of term-forming operators may be formalised in a way important for the further development of proof-theoretic semantics.

In this study we have restricted our considerations to classical FOLI, but the results we obtained may easily be extended to the intuitionistic version by restricting sequents to single-succedent ones. It is an open problem how to adapt this kind of analysis to other predicates of similar character applied in nonclassical logics.

**Acknowledgements** I am greatly indebted to Nils Kürbis and the referees of this paper for many valuable remarks which helped to improve the final version. The results reported in this paper are supported by the National Science Centre, Poland (grant number: DEC-2017/25/B/HS1/01268).

#### **References**


Open Access This chapter is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. If you remix, transform, or build upon this chapter or a part thereof, you must distribute your contributions under the same license as the original.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Eight Rules for Implication Elimination**

Michael Arndt

**Abstract** Eight distinct rules for implication in the antecedent for the sequent calculus, one of which is Gentzen's standard rule, can be derived by successively applying a number of cuts to the logical ground sequent A → B, A ⇒ B. A naive translation into natural deduction collapses four of those rules onto the standard implication elimination rule, and the remaining four rules onto the general elimination rule. This collapse is due to the fact that the difference between a formula occurring in the succedent of a premise of a sequent calculus rule and that formula instead occurring in the antecedent of the conclusion cannot be adequately expressed in the framework of natural deduction. In contrast to this, the difference between a formula occurring in the succedent of the conclusion of a sequent calculus rule and that formula instead occurring in the antecedent of a premise corresponds exactly to the distinction between the standard implication elimination rule and its general counterpart. This incongruity can be remedied by introducing a notational facility that requires the particular premise of a rule to which it is attached to be an assumption, i.e., that prevents it from being the conclusion of another rule application. Applying this facility to implication elimination results in eight distinct rules that correspond exactly to the eight sequent calculus rules. These eight rules are presented and discussed in detail. It turns out that a natural deduction calculus (for positive implication logic) that employs the rule corresponding to the standard left implication rule of the sequent calculus as well as a rule for explicit substitution can be seen as a natural deduction style sequent calculus.

## **1 Introduction**

In a previous article we presented eight sequent calculus rules for implication in the antecedent (left implication) and discussed their properties (Arndt, 2019). That

Michael Arndt
Department of Computer Science, University of Tübingen, Germany, e-mail: arndt@cs.uni-tuebingen.de

© The Author(s) 2024
T. Piecha and K. F. Wehmeier (eds.), *Peter Schroeder-Heister on Proof-Theoretic Semantics*, Outstanding Contributions to Logic 29, https://doi.org/10.1007/978-3-031-50981-0\_8

$$\begin{gathered}
\frac{}{A \to B, A \Rightarrow B}\ (\rightarrow\mathrm{L}\_{000}) \\[1.5ex]
\frac{\Gamma \Rightarrow A \to B}{A, \Gamma \Rightarrow B}\ (\rightarrow\mathrm{L}\_{100}) \qquad \frac{\Gamma \Rightarrow A}{A \to B, \Gamma \Rightarrow B}\ (\rightarrow\mathrm{L}\_{010}) \qquad \frac{B, \Gamma \Rightarrow \Lambda}{A \to B, A, \Gamma \Rightarrow \Lambda}\ (\rightarrow\mathrm{L}\_{001}) \\[1.5ex]
\frac{\Gamma\_1 \Rightarrow A \to B \qquad \Gamma\_2 \Rightarrow A}{\Gamma\_1, \Gamma\_2 \Rightarrow B}\ (\rightarrow\mathrm{L}\_{110}) \qquad \frac{\Gamma\_1 \Rightarrow A \to B \qquad B, \Gamma\_2 \Rightarrow \Lambda}{A, \Gamma\_1, \Gamma\_2 \Rightarrow \Lambda}\ (\rightarrow\mathrm{L}\_{101}) \\[1.5ex]
\frac{\Gamma\_1 \Rightarrow A \qquad B, \Gamma\_2 \Rightarrow \Lambda}{A \to B, \Gamma\_1, \Gamma\_2 \Rightarrow \Lambda}\ (\rightarrow\mathrm{L}\_{011}) \qquad \frac{\Gamma\_1 \Rightarrow A \to B \qquad \Gamma\_2 \Rightarrow A \qquad B, \Gamma\_3 \Rightarrow \Lambda}{\Gamma\_1, \Gamma\_2, \Gamma\_3 \Rightarrow \Lambda}\ (\rightarrow\mathrm{L}\_{111})
\end{gathered}$$

**Fig. 1** The eight inference rules for intuitionistic left implication.

investigation was motivated by Schroeder-Heister's argument for a 'conceptually more elementary and more plausible' rule (Schroeder-Heister 2010; 2011) against the backdrop of our own research into the origins of structural reasoning as promoted by Paul Hertz (1922; 1923; 1929) as well as von Plato's recent documentation (2003a) of the development of the sequent calculus from Hertz' sentence calculus. From this historical vantage point both the conventional rule (→L) as well as Schroeder-Heister's alternative rule (→L)◦ are derivable in a purely structural sequent calculus from the *logical ground sequent* → , ⇒ by applications of the cut rule. In such a calculus logical ground sequents are the main way of meaningfully introducing logical symbols into sequents.1

Since this particular logical ground sequent consists of three formulae, any number of which can be used as cut formula(e), there are eight possible rules that can be derived in this manner. Figure 1 presents the intuitionistic variants of those eight rules that will be relevant for this investigation. The intuitionistic restriction is that Λ (where applicable) must not contain more than one formula.2 Each one of the

<sup>1</sup> Weakening and rules with an additive flavour can always be used to introduce arbitrary logical symbols.

<sup>2</sup> For each of these, the corresponding classical rule is easily restored by adding (where applicable) Δ1 to the succedent of each premise whose displayed succedent formula is A and Δ2 to the succedent of each premise whose displayed succedent formula is A → B, while adding the same to the conclusion, as well as dropping the restriction on Λ.

rules is systematically labelled (→Lxyz), where x, y, z ∈ {0, 1}. The indices x, y, z correspond to the three formulae A → B, A and B occurring in the ground sequent, and each number specifies for the respective formula whether it is used as a cut formula in the derivation of the rule or not. To begin with, if x = 0 then the formula A → B *is not* used as a cut formula in the derivation of that rule and, consequently, occurs in the conclusion of that rule as an antecedent formula; on the other hand, if x = 1 then the formula A → B *is* used as a cut formula in the derivation of that rule, which means that it can no longer occur in its conclusion; moreover, the derived rule must have a premise in which A → B occurs as a succedent formula. In the same manner the value of y specifies the occurrences of formula A. The value of z plays a similar role with regard to the occurrences of formula B, except that "antecedent" has to be exchanged with "succedent" in the characterization and vice versa. Thus rule (→Lxyz) has exactly x + y + z premises, and 3 − (x + y + z) formulae of the ground sequent occur in its conclusion.
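The bookkeeping behind this labelling scheme can be made explicit in a short sketch (an illustration of the counting, not part of the article):

```python
from itertools import product

# Enumerate the eight labels (->L_xyz). Per the labelling scheme, a rule has
# x + y + z premises, and 3 - (x + y + z) formulae of the ground sequent
# A -> B, A => B remain in its conclusion.
rules = {
    f"{x}{y}{z}": {"premises": x + y + z, "conclusion_formulae": 3 - (x + y + z)}
    for x, y, z in product((0, 1), repeat=3)
}

print(len(rules))        # 8 rules in total
print(rules["011"])      # the standard rule (->L_011) has two premises
```

For instance, the axiom (→L000) has zero premises and all three formulae in its conclusion, while (→L111) has three premises and none.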

For example, consider the derivation of the standard sequent calculus rule (→L011) from the logical ground sequent, which is employed as the logical axiom (lgs). The formulae A and B of the ground sequent are the cut formulae in subsequent applications of cut. Note that the resulting derived rule is independent of the order in which the cuts are applied.

$$\frac{\dfrac{\Gamma\_1 \Rightarrow A \qquad \overline{A \to B, A \Rightarrow B}\ (\mathrm{lgs})}{A \to B, \Gamma\_1 \Rightarrow B}\ (\mathrm{cut}) \qquad B, \Gamma\_2 \Rightarrow \Lambda}{A \to B, \Gamma\_1, \Gamma\_2 \Rightarrow \Lambda}\ (\mathrm{cut})$$

Rule (→L000) is simply the logical axiom that has the ground sequent itself as its conclusion, since none of its formulae are used as cut formulae. In each of the rules (→L100), (→L010) and (→L001), one of the formulae in the ground sequent is cut, resulting in a single premise rule each, where the premise contains the respective formula in the appropriate position and the conclusion contains the remaining formulae in the positions in which they occur in the ground sequent. Schroeder-Heister's alternative (→L)◦ is the rule (→L010). In each rule (→L011), (→L101) and (→L110), only one of the formulae in the ground sequent is not cut and the other two are the cut formulae of two applications of cut, thereby resulting in a two premise rule. Rule (→L011) is the standard sequent calculus rule. Finally, the rule (→L111) of three premises is the result of cutting all three of the formulae that occur in the ground sequent. All of these rules and their respective properties are thoroughly discussed in the precursor article (Arndt, 2019).

As a notational convenience for the purpose of talking about several rules, the wildcard symbol "∗" for x, y or z admits either value, and we employ the same notation when talking about implication elimination rules later in this article.

It is the purpose of the present investigation to transfer the observations that were made with regard to these sequent calculus rules to the calculus of natural deduction. This immediately poses a number of questions:

1. As natural deduction is fundamentally a calculus based on rules, what kind of object could correspond to the logical ground sequent that was the starting point in our preceding work?


The key to answering these questions lies in two fundamental modifications of the calculus of natural deduction. Firstly, in an effort to delaminate the logical and the compositional aspects of natural deduction rules, we add a structural rule of *explicit composition* to the calculus that governs the assembly of derivations.3 This rule is related to the rules of explicit substitution or delayed substitution that were first introduced for the λ-calculus (Abadi, Cardelli, Curien, and Lévy, 1991) and then investigated for their logical properties by, e.g., Espírito Santo (2007) and Pfenning (2007). However, in those cases they were studied for λ-calculi or sequent-style natural deduction systems with term labels, whereas we will provide a straightforward natural deduction rule in standard notation. We will employ this rule for the purpose of composing (i.e., constructing) derivations ab initio.4 Secondly, we will introduce *proudness markers* that prevent premises thus marked from being the conclusion of any rule application. A suitably marked variant of standard implication elimination will correspond to the logical ground sequent. These two modifications together form an adequate basis for the translations of the eight rules for left implication into natural deduction.

This article is structured as follows. Section 2 recalls Schroeder-Heister's argument for suggesting his alternative sequent calculus rule (→L)◦, based on an observation of how derivations are composed in natural deduction, thus motivating the introduction of an explicit composition rule. It also addresses the matter of preventing the composition of derivations, which results in the notion of proudness markers. In Section 3 both additions to the calculus of natural deduction are used to distinguish eight rules for implication elimination and to demonstrate their interderivability. The properties of these eight rules with regard to their usefulness are discussed in Sections 4 and 5. Section 6 widens the discussion from a merely technical matter to questions of improving the relationship between Gentzen's calculi as well as the role of directionality in the calculus of natural deduction. Section 7 closes with a summary and a brief outlook.

<sup>3</sup> The name of the rule is taken from one of our earlier articles (Arndt and Tesconi, 2014), where it is used as the single generalized composition principle in a sequent calculus framework.

<sup>4</sup> Explicit composition is thus used in the same specific manner as the cut rule is used in the precursor article to obtain sequent calculus rules from ground sequents.

Eight Rules for Implication Elimination 243

$$\frac{\Gamma \Rightarrow A \qquad B,\, \Delta \Rightarrow C}{A \to B,\, \Gamma,\, \Delta \Rightarrow C}\ (\to\text{L}) \qquad\qquad\qquad\qquad \frac{\Gamma \Rightarrow A}{A \to B,\, \Gamma \Rightarrow B}\ (\to\text{L})^\circ$$

**Fig. 2** Standard left introduction (→L) and Schroeder-Heister's (→L)◦ .

#### **2 Composing derivations**

Schroeder-Heister's suggestion of a new rule (→L)◦ to replace the standard sequent calculus rule is based on his interpretation of *implications-as-rules* in the calculus of natural deduction (Schroeder-Heister 1984; 2010; 2011). According to this view, an implication A → B codifies the fact that it is possible to derive the formula B from an assumption A, which is represented by a new "defined" rule that has A as a premise and B as a conclusion.5 Thus, the standard implication elimination rule is replaced by a generic schema for rule application that has the implication, which codifies the rule in question, as a parameter:

$$A \to B \;\; \frac{A}{B} \tag{1}$$

The schema reads as follows: "B can be inferred from A under the assumption that there is a rule A → B that admits this inference." Schroeder-Heister observes that, while this formal reinterpretation of *implications-as-rules* is easy to implement in natural deduction, the standard inference rule of the sequent calculus (→L) that corresponds to implication elimination does not suggest such a reading. While the rule obviously concludes an implication A → B as part of the antecedent, i.e., introduces a hypothetical implication, it is clearly not about utilizing this assumption for the purpose of deriving a formula B. This is why Schroeder-Heister suggests a new inference rule (→L)◦ for the sequent calculus, which makes it possible to conclude, on the basis that A is the consequence of certain formulae, that formula B must be the consequence of those formulae if the additional hypothesis A → B is introduced at the same time. Figure 2 presents both sequent calculus rules.

There are two ways of looking at the relationship between the rules (→L) and (→L)◦. In order to see what is taking place inside of (→L), we can derive the rule on the basis of (→L)◦ by means of a subsequent cut on formula B:

$$\frac{\dfrac{\overset{\mathcal{D}_1}{\Gamma \,\Rightarrow\, A}}{\Gamma,\, A \to B \,\Rightarrow\, B}\ (\to\text{L})^\circ \qquad B,\, \Delta \,\Rightarrow\, C}{\Gamma,\, A \to B,\, \Delta \,\Rightarrow\, C}\ (\text{cut})$$

In other words, if we give primacy to (→L)◦, then the standard rule is easily derived by performing a cut on the succedent formula B in the conclusion of that rule and the hypothetical formula B in the antecedent of another sequent. Thus, the consequence

<sup>5</sup> Indeed, Schroeder-Heister uses a considerably more involved natural deduction system that employs the notation A ⇒ B (not to be mistaken as a sequent arrow) for rules of higher order, from which implications A → B can be obtained through introduction rules and vice versa.

244 Michael Arndt

$$\frac{\overset{\mathcal{D}_1}{\Gamma \,\Rightarrow\, A}}{\Gamma,\, A \to B \,\Rightarrow\, B}\ (\to\text{L})^\circ \;\leadsto\; A \to B \;\; \frac{\overset{\mathcal{D}_1^*}{A}}{B} \;\;(1) \qquad\qquad \frac{\overset{\mathcal{D}_1}{\Gamma \,\Rightarrow\, A} \quad \overset{\mathcal{D}_2}{\Delta,\, B \,\Rightarrow\, C}}{\Gamma,\, \Delta,\, A \to B \,\Rightarrow\, C}\ (\to\text{L}) \;\leadsto\; A \to B \;\; \frac{\overset{\mathcal{D}_1^*}{A}}{\begin{array}{c} B \\ \mathcal{D}_2^* \\ C \end{array}} \;\;(2)$$

**Fig. 3** (→L)◦ , (→L) and their translations into natural deduction with rules.

formula B features merely as an intermediate result that is subsequently related to the corresponding hypothesis B in another sequent. Alternatively, if we give primacy to the standard rule, then (→L)◦ is trivially derivable by choosing a suitable instance of the structural axiom for the right premise:

$$\frac{\overset{\mathcal{D}_1}{\Gamma \,\Rightarrow\, A} \qquad \dfrac{}{B \,\Rightarrow\, B}\ (\text{id})}{A \to B,\, \Gamma \,\Rightarrow\, B}\ (\to\text{L})$$

Both analyses convey the same picture, namely that in a certain sense too much is taking place within an application of (→L). Not only does it encode the information that when a formula A is given (as the consequence of certain other formulae), the assumption of an implication A → B allows us to simultaneously proceed to a new consequence B; it furthermore proceeds to use that newly obtained formula in order to create a link to some other consequence relation in which B figures as a hypothesis, something which in itself quite obviously has no immediate connection to an intuitive understanding of implication.6 This second effect can only be prevented by providing as second premise the trivial consequence relation B ⇒ B that relates B back to itself. It is apparent that a more natural approach is to employ the rule (→L)◦, which expresses the core feature of implication, and to use the cut rule to explicitly create a link to another consequence relation.7

Figure 3 relates the two sequent calculus rules under discussion to the calculus of natural deduction that additionally uses a rule-based inference schema.8 What was just observed for the sequent calculus rules is immediately reflected in their translations. In (1), the application of a rule in natural deduction merely extends the derivation D<sup>∗</sup><sub>1</sub>

<sup>6</sup> Note that we are talking about implication in an intuitionistic setting. In the classical case, instead of talking of hypotheses and consequences, a semantic argument about the truth values of the various formulae in a sequent immediately establishes (→L) as the ideal rule.

<sup>7</sup> This immediately leads to the issue of non-eliminability of cuts, which, from a technical perspective, is less than desirable. Schroeder-Heister (2011) argues that these cuts are unproblematic, and we systematically discussed the question of analyticity of cuts (Arndt, 2019).

<sup>8</sup> Following the usual conventions for writing down schematic derivations in natural deduction with the purpose of providing more elegant presentations, we omit the parametric sets of hypotheses Γ and Δ in the translation.

of A, which exemplifies Schroeder-Heister's paradigm of *implications-as-rules*. On the other hand, in (2) the implication furthermore establishes B as a link between two derivations D<sup>∗</sup><sub>1</sub> and D<sup>∗</sup><sub>2</sub>, which corresponds to the paradigm of *implications-as-links*.

At this point it is important to be mindful of the fact that natural deduction is fundamentally a unidirectional calculus in which derivations are constructed from the top downward. The prevalent notation for substitution suggests that any derivation can simply be plugged into another one if the conclusion of the former matches an assumption of the latter. Prawitz's assemblage of formal definitions states clearly that a derivation is constructed inductively from other derivations by applying a rule whose premises match the end formulae of those derivations, thereby combining them into a new, single derivation whose end formula is the conclusion of that rule (Prawitz, 1965). Every rule of natural deduction represents a tree constructor which has one or more derivation trees as its argument(s) and yields a new derivation tree whose immediate subtrees are those arguments, arranged according to where their end formulae occur in the premises of the rule.9 Thus, the operative part of a derivation tree is always its end formula, which is the conclusion of its bottommost rule application. In contrast to this, neither the substitution of derivations for assumptions nor the linking of derivations on certain formulae are operations that are part of the calculus proper. They are meta-theoretical operations that are formally defined by recursively deconstructing a given derivation up to its leaves, some or one of which are then replaced by the derivation that is the substituens, after which the deconstructed derivation is reassembled around the substituted part(s).
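The distinction just drawn, between rules as tree constructors within the calculus and substitution as a meta-theoretical operation on trees, can be sketched in a few lines of Python. This is purely an illustration, not part of the formal development; the names `Deriv`, `apply_rule`, and `subst` are our own.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Deriv:
    """A derivation tree: an end formula, the rule concluding it,
    and the subderivations whose end formulae match the rule's premises."""
    end: str          # end formula (conclusion of the bottommost rule)
    rule: str         # rule name; "assumption" marks a leaf
    subs: tuple = ()  # immediate subderivations

def apply_rule(rule: str, conclusion: str, *subs: Deriv) -> Deriv:
    """A tree constructor in Prawitz's sense: combine given derivations
    into a new one whose end formula is the rule's conclusion."""
    return Deriv(conclusion, rule, tuple(subs))

def subst(d: Deriv, assumption: str, replacement: Deriv) -> Deriv:
    """Meta-theoretical substitution: deconstruct d up to its leaves,
    replace leaves matching the assumption, and reassemble."""
    if d.rule == "assumption" and d.end == assumption:
        return replacement
    return Deriv(d.end, d.rule,
                 tuple(subst(s, assumption, replacement) for s in d.subs))

# Example: a rule application concluding B from A -> B and A,
# composed (meta-theoretically) with a derivation ending in A.
d1 = apply_rule("some-rule", "A")
mp = apply_rule("->E", "B",
                Deriv("A -> B", "assumption"),
                Deriv("A", "assumption"))
composed = subst(mp, "A", d1)
```

Note that `subst` is defined by recursion over the whole tree, exactly the deconstruction-and-reassembly that the text describes as lying outside the calculus proper.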

Consequently, the difference between (1) and (2) in Figure 3 is a substantial one. The former is merely the application of a rule, i.e., an operation within the calculus (albeit in a somewhat extended calculus), whereas the latter adds to this an operation that, strictly speaking, lies outside of the calculus proper. This observation certainly underscores Schroeder-Heister's case that (1) is philosophically more fundamental than (2).

#### **2.1 Explicit composition**

It is quite easy to restate (2) in a manner that is rule-based. The key is to use a rule that merely states the intention that some derivation is to be substituted for an assumption in another derivation. This corresponds to the explicit substitution of terms into terms in the λ-calculus (Espírito Santo, 2007; Pfenning, 2007), whose typing rule is closely related to the cut rule. As Figure 4 demonstrates, it is quite straightforward to translate an application of the cut rule into natural deduction by using well-known translation principles without having to perform a substitution. The left premise of cut, in which the cut formula A occurs in the succedent, is translated as a premise A of the new natural deduction rule that is the end formula of a derivation with assumptions Γ obtained from a translation of D<sub>1</sub>. The right premise of cut has the cut formula

<sup>9</sup> Of course, discharge must be seen as a side effect of such a tree constructor.


$$\frac{\overset{\mathcal{D}_1}{\Gamma \,\Rightarrow\, A} \qquad \overset{\mathcal{D}_2}{A,\, \Delta \,\Rightarrow\, C}}{\Gamma,\, \Delta \,\Rightarrow\, C}\ (\text{cut}) \qquad\leadsto\qquad \frac{\overset{\mathcal{D}_1^*}{A} \qquad \overset{\overset{[A]^n}{\mathcal{D}_2^*}}{C}}{C}\ (\text{ec:}n)$$

**Fig. 4** The translation of cut yields explicit composition.

occurring in its antecedent, and its succedent is a parametric formula C, which is translated as a second premise C that is the end formula of a derivation obtained from a translation of D<sub>2</sub> that has A as an assumption beside Δ. Corresponding to the conclusion of cut, in which the antecedents of the premises are joined and the succedent is the parametric formula C, the conclusion of the new rule repeats the parametric C of its second premise. However, this C is independent of the assumption A, which may therefore be discharged. We call the resulting rule for the natural deduction setting *explicit composition*:

$$\frac{A \qquad \overset{[A]}{C}}{C}\ (\text{ec})$$

The same rule was employed by Prawitz (2015) under the name "substitution" to delay substitutions resulting from contractions of maximal implications for the purpose of retaining the overall structure of derivations under normalization. Of course, it is possible to provide a contraction step that executes the substitution:

$$\frac{\overset{\mathcal{D}_1}{A} \qquad \overset{\overset{[A]^1}{\mathcal{D}_2}}{C}}{C}\ (\text{ec:}1) \qquad\rhd\qquad \begin{array}{c} \mathcal{D}_1 \\ A \\ \mathcal{D}_2 \\ C \end{array}$$

In the standard calculus of natural deduction, a composition of derivation D<sub>1</sub> with end formula A and derivation D<sub>2</sub> that has an assumption A is only expressible by means of the meta-theoretical substitution operation. In contrast to that, rule (ec) is a proper inference rule of an extended calculus, i.e., a tree constructor, that allows the composition of D<sub>1</sub> and D<sub>2</sub> into a single derivation. Since the substitution occurs in the contractum, the redex itself can also be considered to be a form of "delayed substitution" or "explicit substitution", which is what its sequent-style relative has been called by other authors (Espírito Santo, 2007; Pfenning, 2007).
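The contrast between (ec) as an in-calculus tree constructor and the contraction step that executes the delayed substitution can likewise be sketched in Python. Again this is our own illustrative model, not the paper's formalism; `ec`, `contract`, and the rule-name encoding are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Deriv:
    end: str          # end formula
    rule: str         # rule name; "assumption" marks a leaf
    subs: tuple = ()

def ec(d1: Deriv, d2: Deriv, discharged: str) -> Deriv:
    """Explicit composition as a proper inference rule (tree constructor):
    d1 ends in the formula `discharged`, d2 may depend on it as an
    assumption; the conclusion repeats d2's end formula."""
    assert d1.end == discharged
    return Deriv(d2.end, f"ec:{discharged}", (d1, d2))

def _subst(d: Deriv, assumption: str, repl: Deriv) -> Deriv:
    """Meta-level substitution of a derivation for matching assumption leaves."""
    if d.rule == "assumption" and d.end == assumption:
        return repl
    return Deriv(d.end, d.rule,
                 tuple(_subst(s, assumption, repl) for s in d.subs))

def contract(d: Deriv) -> Deriv:
    """Contraction step: execute the delayed substitution hidden in each
    (ec) node, eliminating the self-contained redex."""
    if d.rule.startswith("ec:"):
        a = d.rule[3:]
        d1, d2 = d.subs
        return contract(_subst(d2, a, d1))
    return Deriv(d.end, d.rule, tuple(contract(s) for s in d.subs))

# (ec) applied to a derivation of A and a derivation of C from assumption A:
d1 = Deriv("A", "r1", ())
d2 = Deriv("C", "r2", (Deriv("A", "assumption"),))
redex = ec(d1, d2, "A")
normal = contract(redex)
```

Constructing `redex` stays entirely within the (extended) calculus; only `contract` performs the meta-theoretical reassembly.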

Considering that (ec) is neither an introduction nor an elimination rule, any derivation that contains one or more applications of (ec) cannot be considered to be in normal form. A maximal formula occurrence is the conclusion of an introduction rule that is also the major premise of an elimination rule, thus representing a detour in a derivation, and a redex consists of the corresponding succession of the introduction and the elimination rules. This new rule (ec) is rather extraordinary as it is essentially a self-contained redex. However, the formula A is not a maximal occurrence of a formula. Rather, via both of its occurrences, it redirects the derivation D<sub>1</sub> with this end formula to D<sub>2</sub>, which depends on A as its assumption.

The idea of a rule that redirects the flow of a derivation from one of its premises to the assumption upon which another premise depends is not new. The standard disjunction elimination rule (∨E) uses this feature, the difference being that the redirection is divided upon the assumptions of two derivations. In this rule the disjunction A ∨ B is eliminated by a case distinction on its subformulae. However, as this is not expressible in the single conclusion setting of natural deduction, it has to be executed indirectly by separately considering two derivations, one depending on A and the other on B, and bringing them together by having reached the same end formula C in both derivations. Thus, C can summarily be concluded, thereby eliminating the disjunction, and this occurrence of C is no longer dependent on A nor on B.

$$\frac{A \lor B \qquad \overset{[A]}{C} \qquad \overset{[B]}{C}}{C}\ (\lor\text{E})$$

This rule does not relate its major premise to a formula in the conclusion that has to bear any relation to the disjunction or any of its subformulae. Instead, the conclusion of (∨E) is a parametric device to collect (and contract) the side premises. The logical content thus resides in the setting of assumptions, and the only additional effect an application of the rule has is to discharge those assumptions.

The fundamental idea is that of anticipating the possible aftermath of the application of an inference rule before that rule is applied. In other words, knowing that the application of a certain rule has some formula, say B, as its conclusion, instead of actually applying the rule and continuing the construction of the subsequent derivation, that subsequent derivation is instead constructed beforehand with B as an assumption. The end formula of that derivation, say C, is added as an additional premise of the rule in question, and the conclusion of the rule is changed from B to C. Thus, the modified rule not only has the proper premises of the original rule, it also has another premise with the sole purpose of anticipating and collecting the aftermath of the application of the original rule.10 This idea, which was conceived by Gentzen for the purpose of allowing for an elimination rule for disjunction in a calculus that only admits single conclusions, was first suggested by Prawitz (1979), subsequently elaborated by Schroeder-Heister (1984), and later by von Plato (2001) in the form of *general elimination rules*. The explicit composition rule (ec) epitomizes the principle underlying this idea.

<sup>10</sup> While it is easily possible to add additional premises or to indicate extensive changes in the structure of derivations, due to the emphasis that natural deduction places on the singular conclusions of inference rules and singular end formulae of derivations, this kind of shifting of the conclusion of a rule into an assumption on a premise requires the introduction of a parametric formula C. This is due to the way natural deduction implicitly addresses structural matters. The corresponding modification in the sequent calculus would simply result in an empty succedent, which could subsequently be weakened into a singleton succedent C.

Similarly, the notation [A] does not necessarily refer to a singular assumption A, but can refer to multiple such assumptions or even none at all. We will not attempt an exegesis of such structural matters at this point.

$$\frac{A \to B \qquad A}{B}\ (\to\text{E}) \qquad\qquad \frac{A \to B \qquad A \qquad \overset{[B]}{C}}{C}\ (\to\text{gE})$$

**Fig. 5** Standard and general elimination rules for implication.

$$\frac{\overset{\mathcal{D}_1}{\Gamma \,\Rightarrow\, A} \quad \overset{\mathcal{D}_2}{\Delta,\, B \,\Rightarrow\, C}}{\Gamma,\, \Delta,\, A \to B \,\Rightarrow\, C}\ (\to\text{L}) \;\leadsto\; \frac{A \to B \quad \overset{\mathcal{D}_1^*}{A} \quad \overset{\overset{[B]^n}{\mathcal{D}_2^*}}{C}}{C}\ (\to\text{gE:}n) \qquad \frac{\overset{\mathcal{D}_1}{\Gamma \,\Rightarrow\, A}}{\Gamma,\, A \to B \,\Rightarrow\, B}\ (\to\text{L})^\circ \;\leadsto\; \frac{A \to B \quad \overset{\mathcal{D}_1^*}{A}}{B}\ (\to\text{E})$$

**Fig. 6** (→L)◦ , (→L) and their translations into natural deduction.

#### **2.2 Standard and general elimination rules**

The standard elimination rule and the general elimination rule for implication are presented in Figure 5. The similarity of these rules is rather obvious due to the fact that both of them have premises A → B and A. It is immediately apparent that the instantiation of B for C in (→gE) yields (→E), as the discharge can be applied to the premise B itself, thereby rendering it inert:

$$\frac{A \to B \qquad A \qquad [B]}{B}\ (\to\text{gE})$$

Conversely, the rule (→gE) can be (schematically) derived by applying (ec) to the conclusion of (→E):

$$\frac{\dfrac{A \to B \qquad A}{B}\ (\to\text{E}) \qquad \overset{[B]}{C}}{C}\ (\text{ec})$$

This is the same phenomenon that we have observed for the relationship between the rules (→L) and (→L)◦ in the sequent calculus, with the discharged premise B playing the role of the trivial sequent B ⇒ B and (ec) playing the role of the cut rule.

Based on our observation about the composition of derivations in natural deduction, we will tentatively consider (→gE) as the translation of (→L), and (→E) as the translation of (→L)◦, as depicted in Figure 6. It should be noted that this does not counter Schroeder-Heister's view of (→L) as representing the paradigm of *implications-as-links*. Rather than the usual linking by means of substitution, we utilize linking by explicit composition, in which the hidden conclusion B appears as (a reference to) the assumption B in another subderivation.

This analysis of composition accounts for the difference in the translations of the sequent calculus rules (→L010) and (→L011), i.e., how the consequence formula B of the logical ground sequent A → B, A ⇒ B is treated under translation. In one case the formula B is retained in the conclusion of the rule application, in the other case it is mentioned as an assumption in another derivation. In both cases a derivation is extended in the direction of the conclusion of the rule application.

#### **3 Preventing composition of derivations**

At this point one could content oneself with the observation that any one of the four rules (→L∗∗0) translates into (→E), and that any one of the other four rules (→L∗∗1) translates into (→gE), since it is the last digit in the index that determines whether formula B occurs in the succedent of the conclusion of these rules, which corresponds to (→E), or in the antecedent of a premise, which corresponds to (→gE). In the usual natural deduction setting these are the two known alternative rules governing the elimination of implication.

The translations depicted in Figure 6 already exhibit the key to a somewhat more fine-grained analysis, however. Both rules (→E) and (→gE) have premises A → B and A, but only the premise A is mentioned as the end formula of a derivation D<sup>∗</sup><sub>1</sub>, the translation of a sequent calculus derivation D<sub>1</sub>, whereas the main premise A → B *stands proud* as an assumption in the sense of Tennant (1992). We propose that the former, i.e., the usual understanding of premises of natural deduction rules, is the image of the translation of a succedent formula in a premise of a sequent calculus rule, whereas the latter is the image of the translation of an antecedent formula in the conclusion of a sequent calculus rule. In order to give an adequately differentiated account of the different roles of the premises A → B and A, we have to find a mechanism that makes premises of natural deduction rules unavailable for the construction (composition) of derivations. It is instructive for this purpose to review how inference rules give rise to derivations in natural deduction.

#### **3.1 Constructing derivations from inference rules**

Gentzen (1935) himself did not provide a procedure for the construction of derivations, instead merely giving the following characterization:

(3.2) A *proof figure*, called a *derivation* for short, consists of a number of formulae (at least one), which combine to form inference figures in the following way: Each formula is the lower formula of at most one inference figure; each formula (with the exception of exactly one: the *endformula*) is the upper formula of at least one inference figure; and the system of inference figures is non-circular [. . .].

(3.3) [. . .] A derivation is in *tree form*, if each one of its formulae is upper formula of *at most* one inference figure. [. . .] We shall have to treat only of derivations in tree form.

Thus, inference rules essentially serve as tools for the local verification that a given figure is indeed a derivation by correlating them to certain premises (upper formulae) and conclusions (lower formulae).

A formal method for the construction of derivations is formulated by Prawitz in his seminal work on natural deduction (Prawitz, 1965). For this purpose, he introduces the noteworthy distinction between *inference rules* and *deduction rules*. The former essentially express the immediate logical relationships between certain (schematic) premises and certain (schematic) conclusions. He accounts for the additional allowance for discharging certain assumptions that some of the inference rules express by calling those *"improper"*, in distinction to the *"proper"* inference rules that only address the relationships between certain premises and a conclusion. What is presumably improper about certain rules is the fact that they express logical relationships in an indirect manner. The immediacy of the proper inference rules is depicted in the usual notation by the fact that the respective formulae are only separated by a horizontal stroke. The premise or premises of a rule are written immediately above this stroke, and the conclusion is recorded immediately below it. On the other hand, the improper inference rules make it necessary to talk about formulae that lie beyond this immediate relation between premises and conclusions. In order to avoid having to refer directly to derivations, Prawitz formulates *deduction rules* for the improper inference rules. Instead of simply relating one or more premise formulae to a conclusion formula, a deduction rule relates one or more premises of a more general kind, namely pairs of assumptions and a formula dependent on them, to a more general form of conclusion: also a pair, consisting of assumptions suitably obtained from the assumptions of the premises and a dependent formula.
For example, the deduction rule that is obtained for the rule (∨E) is given as ⟨⟨Γ<sub>1</sub>, A ∨ B⟩, ⟨Γ<sub>2</sub>, C⟩, ⟨Γ<sub>3</sub>, C⟩ / ⟨Δ, C⟩⟩<sup>11</sup> with the side condition that Δ = Γ<sub>1</sub> ∪ (Γ<sub>2</sub> − {A}) ∪ (Γ<sub>3</sub> − {B}).<sup>12</sup> Finally, based on the proper inference rules and the deduction rules (which replace the improper inference rules for this purpose), Prawitz describes a procedure for the composition of new derivations from given ones, based on their end formulae and the assumptions those depend on. For a proper inference rule this is done by identifying the premise(s) of a suitable instance of the rule with the end formula(e) of the given derivation(s) and producing the conclusion of the rule as new end formula, as well as using (the union of) the assumptions as the formulae on which this end formula depends. For the deduction rules, this is done by simultaneously identifying the assumptions of given derivations as well as their end formulae with a premise pair of a suitable instance of the deduction rule, and by producing as conclusion a pair consisting of an appropriately modified set of assumptions and the new end formula.
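The assumption bookkeeping that Prawitz's deduction rule for (∨E) performs can be made concrete in a short Python sketch. The representation of premise pairs and the function name `or_elim` are our own; only the side condition Δ = Γ₁ ∪ (Γ₂ − {A}) ∪ (Γ₃ − {B}) is taken from the text.

```python
from typing import FrozenSet, Tuple

# A premise or conclusion pair <assumptions, dependent formula>
Pair = Tuple[FrozenSet[str], str]

def or_elim(p1: Pair, p2: Pair, p3: Pair, a: str, b: str) -> Pair:
    """Deduction rule for (vE): from <G1, A v B>, <G2, C>, <G3, C>
    conclude <D, C> with D = G1 u (G2 - {A}) u (G3 - {B})."""
    (g1, disj), (g2, c2), (g3, c3) = p1, p2, p3
    # the major premise must be the disjunction, the side premises must agree
    assert disj == f"{a} v {b}" and c2 == c3
    delta = g1 | (g2 - {a}) | (g3 - {b})
    return (delta, c2)

# <{A v B}, A v B>, <{A}, C>, <{B}, C>  yields  <{A v B}, C>:
concl = or_elim((frozenset({"A v B"}), "A v B"),
                (frozenset({"A"}), "C"),
                (frozenset({"B"}), "C"),
                "A", "B")
```

The discharge of A and B is visible purely as set subtraction on the assumption components, with no reference to derivation trees, which is exactly the point of the deduction-rule format.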

We are recalling this procedure in such detail not because it illustrates the effort

$$\frac{\Gamma_1 \vdash A \lor B \qquad \Gamma_2 \vdash C \qquad \Gamma_3 \vdash C}{\Gamma_1,\ \Gamma_2 - \{A\},\ \Gamma_3 - \{B\} \vdash C}\ (\lor\text{E}).$$

<sup>11</sup> For clarity, we use a slightly modified notation. Prawitz uses another comma to separate the conclusion pair from the premise pairs instead of the slash "/".

<sup>12</sup> It is quite apparent that this deduction rule could with only small modifications be formulated in a sequent-style natural deduction format as follows:


**Fig. 7** Four styles of displaying inference rules.

that goes into explaining how inference rules give rise to deductions. Instead, the curious distinction that Prawitz makes between proper inference rules and deduction rules inhabits an interesting position among the different manners in which the inference rules of natural deduction are understood and, consequently, presented with regard to their role in the construction of derivations.

Figure 7 compares four styles of displaying the rules of natural deduction calculi. Style (1) is the notation used by Gentzen to express *inference figure schemata*. They contain no direct reference to derivations whatsoever. While this style is adopted by Prawitz (who calls them *inference rules*), he subsequently distinguishes between proper and improper inference rules and introduces deduction rules as a more detailed explication of the improper inference rules for the purpose of describing a procedure for constructing derivations. Style (2) is a modification of the former that we propose in order to account for the distinction Prawitz made. His deduction rules are represented in a manner more suitable for a comparison with the other styles.13 It is quite remarkable that the main formula A ∨ B of the deduction rule for the improper inference rule (∨E) is understood in relation to a derivation, whereas this is not the case for the main formula A → B in the proper inference rule (→E). Style (3), perhaps most prominently used by van Dalen (1980), has become the standard notation for displaying rules of natural deduction. It uses what he calls *derivation*

<sup>13</sup> It must be noted that the dots do not signify derivations yet, but merely the relation between certain assumption formulae and another formula depending on those, and, moreover, the parametric assumptions are not mentioned nor is the manner in which the forming of the union and the subtraction of the explicit assumptions occurs.

*rules* that double as demonstrating immediate inferences as well as addressing matters of assumption discharge that are only properly explicable in the derivational setting. What is rather remarkable in this case is the fact that derivational composition on the "proper" premises, i.e., premises that do not depend on assumptions that may be discharged, is not indicated (although it is implicitly assumed), whereas it is explicitly indicated that premise formulae are the end formulae of two derivations depending on respective assumptions, of which those having certain forms may be discharged. Thus, this style is hardly an improvement on (1), as there it is at least apparent that both the premises with discharge markers and those without have to be explained in terms of how derivations are to be composed. Style (4), the uniform representation of the idea that all of the premises of a deduction rule can figure as end formulae of derivations, is also used to depict the rules of the calculus. Noteworthy in this regard are the books by Girard, Lafont, and Taylor (1989) and by Troelstra and Schwichtenberg (1996).14 This style of notation is occasionally employed by authors who use styles (1) and (3) for stating the rules of the calculus, when talking about actual applications of the deduction rules in concrete but otherwise unspecified derivations.

In the literature it is usually assumed to be understood by the reader (or it is occasionally briefly made explicit) how the rules are to be read and how they are to be used for the purpose of constructing derivations. However, in view of the question how the composition of derivations is to be prevented on certain premises, the notationally insignificant employment of dots to signify the possibility of an occurrence of a derivation on top of premises acquires some importance. For the purpose of distinguishing the premises that are available to continue derivations from those that are not, we would prefer to use style (4), i.e., to write vertical dots above premises to indicate that they can be used to continue a derivation with that end formula, and to purposefully *not* write vertical dots above premises that must stand proud. However, we suspect that, due to the frequent use of style (3) in the presentation of rules and of style (4) for concrete derivations in the literature, this would lead to some confusion.

#### **3.2 Proudness markers**

We thus adopt style (1) when stating inference rules, and we introduce as an additional notational measure a *proudness marker* that can be applied to individual premises as long as they do not also carry discharge information.15 This proudness marker has the form of a dotted line above a premise in an inference rule. It signifies that this premise must not be used to continue a derivation, i.e., that it can only figure as an assumption in the derivation.

<sup>14</sup> Here, derivations are written as parameters D, D′ , . . . , D1, D2, . . . instead of vertical dots.

<sup>15</sup> If we were to mark a premise with discharge information to stand proud, the premise itself would be the only assumption that could be discharged. Hence, the proudness marker would trivialize every possible application of the rule by either forcing an empty discharge or by having a proud assumption discharged immediately.

Take as an example the inference rule obtained from standard implication elimination by marking the left (major) premise to stand proud:

$$\frac{\overset{\textstyle \cdots\cdots\cdots}{A \to B} \qquad A}{B}\ (\to\text{E})'$$

As this premise can never be the continuation of a derivation, this occurrence of A → B is designated as an assumption in any derivation resulting from an application of this rule.

There are two quite different ways in which the proudness marker can be understood. The first interpretation remains firmly within the standard conception of natural deduction. According to this, the proudness marker merely expresses the requirement that the premise must be an assumption, i.e., a trivial derivation that is already present as the rule is applied. The second interpretation has it that, given a derivation with end formula A, in order to conclude the formula B by means of this rule, an ad hoc assumption of the major premise A → B is made. While this differs from the first interpretation perhaps only in the sequence of the steps comprising the rule application, it is quite easy to conflate this sequence and understand it as having an almost bidirectional quality in the sense that not only does the application of the rule produce a conclusion, but it also introduces a new assumption at the same time.16

It is the second interpretation that grants the extra conceptual space in which suitably different implication elimination inference rules can be explained and maintained. The standard implication elimination rule is used to extend two derivations, one with end formula A → B and one with end formula A, to a new derivation with end formula B that depends on the union of those assumptions. The rule (→E)′ above can be used to extend a derivation of A from assumptions Γ to a derivation of B with the additional assumption A → B. The variant with both premises marked to stand proud can be understood as playing the role of a *ground derivation* that expresses that a new derivation of a formula B from two new assumptions A → B and A can be stated.
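Operationally, a proudness marker amounts to a side condition on rule application: the marked premise is accepted only as a bare assumption, never as the end formula of a longer derivation. The following Python fragment illustrates this reading under our own naming (`Deriv`, `impl_elim_proud`); it is a sketch, not the paper's formalism.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Deriv:
    end: str          # end formula
    rule: str         # rule name; "assumption" marks a leaf
    subs: tuple = ()

def impl_elim_proud(major: Deriv, minor: Deriv) -> Deriv:
    """(->E)': standard implication elimination with the major premise
    A -> B marked to stand proud, i.e., it must be a bare assumption."""
    if major.rule != "assumption":
        raise ValueError("proud premise must stand proud (be an assumption)")
    a, b = major.end.split(" -> ")  # crude parse of "A -> B"
    assert minor.end == a
    return Deriv(b, "->E'", (major, minor))

# Extends a derivation of A to one of B under the ad hoc assumption A -> B:
d = impl_elim_proud(Deriv("A -> B", "assumption"),
                    Deriv("A", "r", ()))
```

Attempting to pass a non-trivial derivation of A → B as the major premise is rejected, which is precisely the composition restriction the marker expresses.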

#### **4 Eight rules and their relationships**

We can now combine the notational facility of restricting composition on either one of its premises by means of proudness markers with the power of using explicit composition to preempt the subsequent application of rules to the conclusion of the standard implication elimination rule. As each of these modifications can be applied to the two premises or the conclusion independently of one another, this results in eight different rules. Figure 8 displays all of the possible combinations of assigning proudness markers to the premises A → B and A in both the standard rule and the general elimination rule.

<sup>16</sup> These two interpretations correspond to the difference in the sequent calculus between the implication occurring as both antecedent and succedent of an identity axiom in a premise of a rule (→L1∗∗) versus that formula being introduced as an antecedent formula in the conclusion of a rule (→L0∗∗). The details of this will be developed shortly.

**Fig. 8** The eight inference rules for implication elimination.

#### **4.1 Classification**

Each inference rule is systematically labelled (→Exyz), where x, y, z ∈ {0, 1}. The indices x, y and z correspond to the schematic formulae A → B, A and B, respectively. These numbers specify for each of these formulae whether it is available for the purpose of composition with other derivations. While A → B and A are the premises of the standard elimination rule, B is its conclusion. Thus, "composition" means something quite different in these two cases.17

If x = 0 then the premise A → B must stand proud in any application of that rule and is thus not usable for the purpose of continuing derivations with that end formula through this premise. If x = 1 then the premise A → B may be used to continue a derivation that has this implication as its end formula. The value of y specifies the role of premise A in the same manner. The value of z plays a rather more striking role with regard to formula B, as it determines the format of the rule. If z = 0 then the formula B is the conclusion of the two-premise standard elimination rule. Since any derivation resulting from an application of this rule has B as its end formula and can thus be further attached to subsequent rule applications, this value for z does not preclude composition as such, in contrast to what the value 0 for x and y specifies for the formulae A → B and A. If z = 1 then the formula B does not feature as the conclusion of the respective rule but as an assumption to be discharged in the additional premise of the general elimination rule. As mentioned before, when talking about groups of rules, the symbol "∗" for x, y or z can admit either value.
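The combinatorics of this classification can be made concrete in a small executable sketch. The following Python fragment is my own illustration (the names `rules` and `arity` are invented for the purpose); it enumerates the eight labels and computes, for each rule, the sum of its indices, which, as discussed below in connection with the induced rules for derivability statements, is the number of derivations the rule composes:

```python
from itertools import product

# Hypothetical encoding of the eight rules (→E_xyz): the index x governs the
# major premise A→B, y the minor premise A, and z the conclusion B.
# 0 means "stands proud" for x and y, and "standard format" for z;
# 1 means "open for composition" for x and y, and "general format" for z.
rules = {f"E{x}{y}{z}": (x, y, z) for x, y, z in product((0, 1), repeat=3)}

def arity(label):
    """Number of derivations an application of the rule composes: x + y + z."""
    return sum(rules[label])

for label in sorted(rules):
    print(label, arity(label))
```

On this reading, (→E000) composes nothing at all, the standard rule (→E110) composes two derivations, and the general elimination rule (→E111) composes three.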

<sup>17</sup> While there seems to be a difference between what "composition" means at this point, we shall see later on that we can render composition in all three cases by explicit composition.

#### **4.2 Understanding the rules**

So far the new rules are little more than the product of a combinatorial exercise. While the juxtaposition of the standard elimination rule and the general elimination rule has been discussed in the literature with some benefit, the addition of the proudness marker might appear to be something of an ad hoc device. Tennant's result that every derivation in natural deduction can be transformed into a derivation in *exposed normal form* of the same end formula, in which all of the major premises of elimination rules stand proud (Tennant, 1992), gives some pertinence to the idea of restricting major premises of the usual inference rules. However, being able to block the minor premise of an elimination rule from being the result of a derivation does not seem to have a meaningful application. We will proceed to consider possible justifications for the various new rules by elaborating on their differences.

The new rule (→E000) is the standard elimination rule that requires both of its premises to stand proud. The rule is apparently not particularly useful, since it introduces two assumptions, A → B and A, just for the purpose of obtaining the conclusion B. A more efficient way of obtaining B is to simply assume B itself (unless, of course, some dependency on A → B or A is to be subsequently utilized). Since the rule cannot be used to continue a derivation, the whole figure can only occur as a topmost application in a derivation. Thus, the rule is nothing more than a static schema relating the assumption of an implication and the assumption of its antecedent to the consequence of its succedent, as expressed by the derivability statement A → B, A ⊢ B. If we consider a formal framework for deducing such derivability statements for natural deduction, this would correspond to the following axiom schema that immediately delivers statements of this form:18

$$
\overline{A \to B, A \vdash B}
$$

In contrast to this, rule (→E110), the standard implication elimination rule that can be used to extend derivations through both of its premises, can by no means be reduced to something as static as a consequence statement. It has to be expressed in the manner of a derivation rule stating that, if both A → B and A are derivable from certain assumptions, then B, dependent on the union of these assumptions, can be concluded. Thus, rule (→E110) clearly expresses a relationship between derivability statements that could be written as a rule of inference for such statements as follows:

$$\frac{\Gamma_1 \vdash A \to B \qquad \Gamma_2 \vdash A}{\Gamma_1, \Gamma_2 \vdash B}$$

Note that the derivability statement A → B, A ⊢ B mentioned above is easily obtained in the special case that both A → B and A are indeed assumptions (albeit assumptions that are made independently of the subsequent application of the rule):

<sup>18</sup> As we will take the step of relating the left implication rules of the sequent calculus to the new elimination rules shortly, we refrain from introducing labels for these ad hoc inference rules. However, we expressly talk about derivability in natural deduction as opposed to the syntactic notion of sequent, which is why we use "⊢" instead of "⇒" here.

256 Michael Arndt

$$\frac{A \to B \vdash A \to B \qquad A \vdash A}{A \to B, A \vdash B}$$
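The behaviour of (→E110) as a rule on derivability statements, including the special case just displayed, can be rendered in a small sketch. This is my own toy encoding, not the article's: a statement Γ ⊢ φ is a pair of a frozen set of assumptions and a formula, and an implication is a tagged tuple.

```python
# Toy encoding (mine): a derivability statement Γ ⊢ φ is (frozenset, formula);
# implications A→B are tagged tuples ("->", A, B), atoms are strings.

def imp(a, b):
    return ("->", a, b)

def e110(s1, s2):
    """(→E110) on statements: from Γ1 ⊢ A→B and Γ2 ⊢ A infer Γ1, Γ2 ⊢ B."""
    (g1, f1), (g2, f2) = s1, s2
    assert f1[0] == "->" and f1[1] == f2, "major and minor premise do not fit"
    return (g1 | g2, f1[2])

# The special case: both premises are identity statements, yielding A→B, A ⊢ B.
ab = imp("A", "B")
special = e110((frozenset({ab}), ab), (frozenset({"A"}), "A"))
print(special)
```

The set representation already abstracts away the order of the assumptions, a point that becomes significant below when the order of premises is taken seriously.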

The new rules (→E010) and (→E100) are positioned between these two rules, because they both continue a derivation through one of their premises while simultaneously introducing a single new assumption. These are the respective inference rules for derivability statements:

$$\frac{\Gamma \vdash A}{A \to B, \Gamma \vdash B} \qquad\qquad\qquad\qquad \frac{\Gamma \vdash A \to B}{A, \Gamma \vdash B}$$
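In the same spirit, the two displayed rules can be sketched as operations that continue one derivation while adding a single new assumption (again a toy encoding of my own, with statements as pairs of a frozen set of assumptions and a formula; the set representation abstracts away the position of the new assumption):

```python
# Toy encoding (mine) of the induced rules for (→E010) and (→E100):
# implications A→B are tagged tuples ("->", A, B).

def imp(a, b):
    return ("->", a, b)

def e010(stmt, b):
    """(→E010): from Γ ⊢ A infer A→B, Γ ⊢ B, with A→B as a new assumption."""
    gamma, a = stmt
    return (gamma | {imp(a, b)}, b)

def e100(stmt):
    """(→E100): from Γ ⊢ A→B infer A, Γ ⊢ B, with A as a new assumption."""
    gamma, phi = stmt
    assert phi[0] == "->", "premise must derive an implication"
    return (gamma | {phi[1]}, phi[2])

print(e010((frozenset({"G"}), "A"), "B"))
print(e100((frozenset({"G"}), imp("A", "B"))))
```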

Finally, as we have already discussed, the difference between rules (→E∗∗0), the standard format, and (→E∗∗1), the general elimination format, lies in the fact that the former end with the conclusion B, whereas the latter presume that another derivation with end formula C from the assumption B is considered for the application of the rule. This consideration of an additional derivation has the effect, on the respective inference rule for derivability statements, of adding the statement B, Γ ⊢ C as another premise while replacing the original consequence B in the resulting inference rule by the parametric C:

$$\frac{\;\cdots\;}{\ldots \vdash B} \qquad\longrightarrow\qquad \frac{\;\cdots\; \qquad B, \Gamma \vdash C}{\ldots, \Gamma \vdash C}$$

The induced inference rules for derivability statements show that a rule (→Exyz) composes x + y + z derivations into a new derivation. As we have just illustrated, every index that is 1 expresses the power to compose some derivation that is assumed to be given by the application of the rule, whereas every index that is 0 renders the corresponding premise inactive in the case of x and y, or abstains from preempting the subsequent continuation of the derivation in the form of the general elimination format in the case of z. It is thus apparent that from a compositional point of view the differences between the eight inference rules are certainly noteworthy. It is perhaps unexpected that this is not only the case for the two formats of inference rules that are strikingly different, the standard and the general elimination rules. The significantly distinct derivational aspects of the implication elimination inference rules which differ only in whether certain premises are marked to stand proud or not are of the same rank as the difference between those two formats, since they determine how many derivations are to be composed through an application of the respective inference rule.

By way of the ad hoc inference rules for derivability statements it has also become obvious how the eight sequent calculus rules of Figure 1 are related to the eight implication elimination rules in Figure 8. Each rule (→Lxyz) corresponds to the rule (→Exyz) in the sense that (i) there is an appropriate implication elimination rule to be used for the translation of derivations from the sequent calculus into natural deduction, and (ii) the derivational character of each one of the natural deduction rules is described in a sequent-style format by the respective sequent calculus rule.19

<sup>19</sup> Note that the sequent calculus rules are formulated with the usual intuitionistic constraint of |Λ| ≤ 1 for arbitrary succedents Λ, whereas the natural deduction rules use a parametric C which may be the absurdity constant. Replacing Λ by C and allowing absurdity in the sequent calculus yields the exact correspondence.

Whereas the latter correspondence is well known from the literature in the form of sequent-style natural deduction rules, using (→L110) for standard implication elimination and (→L111) for the general elimination rule, the reverse is less prevalent. Indeed, while there are many interesting studies of the translations from the sequent calculus into natural deduction and the role of the cut rule therein (Zucker, 1974; Negri and von Plato, 2001; von Plato, 2003b; Tesconi, 2011), they are all based on the translation of the standard sequent calculus rule (→L), i.e., rule (→L011) in our classification, to either the standard or the general elimination rule. It is apparent that it is actually rule (→E011), the general implication elimination rule in which the premise A → B stands proud, that is its proper correspondent, since the standard left implication rule of the sequent calculus introduces the implication into the antecedent of its conclusion.

#### **4.3 Interderiving the rules**

The sequent calculus rules (→Lxyz) have been shown to be interderivable by means of cut in our companion article (Arndt, 2019). We will now demonstrate a similar interderivability of the eight implication elimination rules, with explicit composition (ec) playing the role of cut. Just as (→L000) is the rule (axiom) from which all other rules can be derived, the inference rule (→E000), the standard elimination rule with both premises standing proud, lies at the heart of this exercise. This means that we consider as fundamental the standard elimination rule that expresses the immediate relationship between the relevant formulae while being derivationally inactive. All of the rules (→Exyz) are derivable from this rule by means of explicit composition. By means of (ec) the inactivity of the components of rule (→E000) can be selectively removed in order to obtain any kind of compositionality format required.

This matter is complicated by the fact that we do not want to make the easy assumption that the order of the premises in a rule is irrelevant. The left premise of rule (→E000) is A → B, and its right premise is A. An application of (ec) to the conclusion B with regard to composition formula A discharges that right premise, but in the resulting derivation the new derived premise A ends up on the left of the premise A → B:

$$
\frac{\begin{array}{c} \vdots \\ A \end{array} \qquad \dfrac{\overline{A \to B} \qquad [A]^1}{B}\;(\to \mathrm{E}_{000})}{B}\;(\mathrm{ec}{:}1)
$$

The rule thus derived is not (→E010), since that rule has the premises occurring in reverse order. The way to remedy this unfortunate complication is to consider a companion rule (ce) to explicit composition20 that has its premises reversed:

<sup>20</sup> Read this rule as "composition explicite" (French) or "composición explícita" (Spanish).

$$
\frac{A \qquad \begin{array}{c} [A] \\ \vdots \\ C \end{array}}{C}\;(\mathrm{ec})
\qquad\qquad
\frac{\begin{array}{c} [A] \\ \vdots \\ C \end{array} \qquad A}{C}\;(\mathrm{ce})
$$

It is apparent that using this new rule for any composition on formula A retains the relative position of the premise A with regard to the other (the left) premise A → B of rule (→E000).
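How (ec) and (ce) act on derivations can likewise be sketched, under a deliberately coarse assumption of my own that a derivation is represented only by its end formula together with its set of open assumptions. On this representation the two rules coincide extensionally, which illustrates why the distinction between them becomes visible only at the level of derivation figures, where the order of the premises matters:

```python
# Coarse sketch (mine): a derivation is (end_formula, open_assumptions).
# Composition on A discharges the assumption A in one derivation by grafting
# a derivation of A onto it, inheriting that derivation's open assumptions.

def ec(d_a, d_c):
    """(ec): the left premise derives A, the right derives C from [A]."""
    (a, open_a), (c, open_c) = d_a, d_c
    return (c, open_a | (open_c - {a}))

def ce(d_c, d_a):
    """(ce): the same composition with its premises in the opposite order."""
    return ec(d_a, d_c)

# Composing a derivation of A (from G) into a derivation of C (from A and H):
composed = ce(("C", frozenset({"A", "H"})), ("A", frozenset({"G"})))
print(composed)
```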

The derivations of all of the rules are shown in Figure 9. They are to be understood as derivation schemas rather than concrete derivations. This is particularly obvious and relevant in the cases of the rules (→E∗∗1), in which the right premise of the final application of (ec) is decorated with the information that assumptions can be discharged. This is simply due to the fact that this premise has never been instantiated in that application of (ec).

Rule (→E000) is listed for the sake of completeness. The remaining three variants of the standard elimination rule (→E∗∗0) are obtained by applying one or both versions of explicit composition to the conclusion of (→E000). For the rule (→E010), the second premise of (ce) is an arbitrary assumption A, and, consequently, the premise A of (→E000) can be discharged. Similarly, for rules (→E1∗0), composition rule (ec) is applied to an arbitrary assumption A → B, and the corresponding implication as premise of (→E000) can be discharged. Alternatively to the manner in which it is derived in Figure 9, rule (→E110) can be obtained by applying the variants of explicit composition in reverse order, i.e., by first applying (ce), thereby addressing A, and then applying (ec), which takes care of A → B.

In the cases of the rules (→E∗∗1) that have the general elimination format it is easy to see that they can be derived from the respective rule (→E∗∗0) already discussed by a final application of (ec). As this application of (ec) introduces the premise C with discharge decoration [B] as a right premise, it cannot be exchanged with the application of (ce) that introduces a new assumption A on the right hand side, as this would result in an inference rule that has C with discharge decoration [B] as its second and A as its third premise, which is not the usual format for the general implication elimination rule. Thus, in the derivations of the rules (→E∗11), the application of (ce) with regard to A must always come before the application of (ec) with regard to B. In contrast to this, the application of (ec) with regard to A → B can occur before or after the application of (ec) with regard to B.

The various ways of deriving the rules (→E∗∗∗) from (→E000) are shown in Figure 10. Each arrow corresponds to an application of (ec) in the cases with the labels A → B or B, or to an application of (ce) in the cases with the label A. Notable are the omissions of arrows from (→E001) to (→E011) and from (→E101) to (→E111), reflecting the fact that an application of (ce) as composition on A after the application of (ec) as composition on B would result in a general elimination rule in which the second and third premises are reversed.21 On the other hand, the subdiagram for the rules (→E∗∗0) commutes, since the application of (ec) to premise A → B and the

<sup>21</sup> It should be noted that it is possible to force the issue and derive (→E111) from (→E101) by extraordinarily applying (ec) instead of (ce) with regard to A, followed by applying (ec) with regard to A → B. However, this tweak cannot be used to derive (→E011) from (→E001), as in both of these A → B must stand proud.


**Fig. 9** Deriving the eight rules from (→E000).

**Fig. 10** Derivability of the rules by (ec) and (ce).

application of (ce) to premise A do not interfere with each other and can thus be transposed. To illustrate this, compare the derivation given for (→E110) in Figure 9 to the following one:

$$
\frac{\begin{array}{c} \vdots \\ A \to B \end{array} \qquad \dfrac{\dfrac{[A \to B]^2 \qquad [A]^1}{B}\;(\to \mathrm{E}_{000}) \qquad \begin{array}{c} \vdots \\ A \end{array}}{B}\;(\mathrm{ce}{:}1)}{B}\;(\mathrm{ec}{:}2)
$$

Furthermore, the subdiagram for the rules (→E∗0∗) commutes, as does the subdiagram for the rules (→E∗1∗), as the two applications of (ec), one with regard to premise A → B and the other with regard to the conclusion B, can be transposed without affecting the order of the premises. Compare (as the most involved example) the following derivation to the one provided for (→E111) in Figure 9:

$$
\frac{\begin{array}{c} \vdots \\ A \to B \end{array} \qquad \dfrac{\dfrac{\dfrac{[A \to B]^3 \qquad [A]^1}{B}\;(\to \mathrm{E}_{000}) \qquad \begin{array}{c} \vdots \\ A \end{array}}{B}\;(\mathrm{ce}{:}1) \qquad \begin{array}{c} [B]^2 \\ \vdots \\ C \end{array}}{C}\;(\mathrm{ec}{:}2)}{C}\;(\mathrm{ec}{:}3)
$$

Note that after the transposition of the two applications of (ec) the conclusion of the composition on A → B is no longer B but the parametric formula C.

#### **5 Employing the new rules in the calculus**

Having established the uniqueness and interderivability of the eight rules for implication elimination (two of which are the standard and the general elimination rules) by means of explicit composition, it is necessary to discuss the usefulness and

$$
\dfrac{\dfrac{\dfrac{\dfrac{[(p \land q) \to r]^3 \qquad \dfrac{[p]^2 \qquad [q]^1}{p \land q}\;(\land \mathrm{I})}{r}\;(\to \mathrm{E}_{110})}{q \to r}\;(\to \mathrm{I}{:}1)}{p \to (q \to r)}\;(\to \mathrm{I}{:}2)}{((p \land q) \to r) \to (p \to (q \to r))}\;(\to \mathrm{I}{:}3)
$$

**Fig. 11** A formula derivable with (→E110) but not with (→E000).

suitability of these rules both as alternatives to the known rules as well as in relation to one another for the purpose of generating derivations and proofs in the calculus.

The fact that all of the rules can be obtained from (→E000) by means of the rules (ec) and (ce) apparently informs us that such a discussion is only meaningful in a calculus in which those composition rules are not available. For this reason we begin by considering independently the effect of replacing the standard rule for implication elimination with one of the other rules. Following these considerations we will propose a calculus of compositional natural deduction in which only the *logical ground inference rule* (→E000) is used together with the composition rules.22

#### **5.1 One rule replacing standard implication elimination**

The relationship between the standard rules and the general elimination rules in natural deduction has been thoroughly investigated (Schroeder-Heister, 1982; 2010; von Plato, 2001; Read, 2010). Specifically, it has been argued that the general elimination rules are the ideal format for deliberations on the meaning of the logical constants, as they can be thought of as being immediately justified by the corresponding introduction rules by means of an inversion principle.23 One important observation in connection with these investigations is the fact that a calculus in which the standard elimination rule (→E110) is replaced by the general elimination rule (→E111) proves exactly the same formulae as the standard calculus.

In contrast to this, it is quite apparent that neither one of the rules (→E∗0∗) is a suitable replacement for the standard elimination rule. Figure 11 shows a derivation of the formula ((p ∧ q) → r) → (p → (q → r)) in which the minor premise of the standard implication elimination is the conjunction p ∧ q, the conclusion of an application of the rule (∧I). It is inconceivable how its subformulae, p and q, could be made available for the purpose of deriving the formula p → (q → r) when only a rule (→E∗0∗) is available that requires its minor premise to stand proud. Consequently,

<sup>22</sup> For this purpose, we will initially look at minimal logic, but will argue that inference rules for other logical connectives can also be replaced by logical ground inferences, thereby obtaining a uniform format for logical inference rules.

<sup>23</sup> Unfortunately, this immediacy is not evident in the case of implication, at least not without resorting to a higher level calculus (Schroeder-Heister, 1982).

$$
\dfrac{\dfrac{\dfrac{\dfrac{\dfrac{[p \to (q \to r)]^3 \qquad [p]^1}{q \to r}\;(\to \mathrm{E}_{110}) \qquad [q]^2}{r}\;(\to \mathrm{E}_{110})}{q \to r}\;(\to \mathrm{I}{:}2)}{p \to (q \to r)}\;(\to \mathrm{I}{:}1)}{(p \to (q \to r)) \to (p \to (q \to r))}\;(\to \mathrm{I}{:}3)
$$

**Fig. 12** A formula derivable with (→E110) but not with (→E010).

$$
\dfrac{\dfrac{\dfrac{\dfrac{[p \to (q \to r)]^5 \qquad [p]^4 \qquad \dfrac{[q \to r]^2 \qquad [q]^3 \qquad [r]^1}{r}\;(\to \mathrm{E}_{011}{:}1)}{r}\;(\to \mathrm{E}_{011}{:}2)}{q \to r}\;(\to \mathrm{I}{:}3)}{p \to (q \to r)}\;(\to \mathrm{I}{:}4)}{(p \to (q \to r)) \to (p \to (q \to r))}\;(\to \mathrm{I}{:}5)
$$

**Fig. 13** Deriving the formula with (→E011).

any calculus that replaces the standard rule by any one of the four rules (→E∗0∗) must be incomplete.

Only two more rules, namely (→E010) and (→E011), have to be considered. These are the two rules in which the major premise but not the minor premise must stand proud, where the former has the standard format and the latter the general elimination format. The challenge for rules (→E0∗∗) is the successive decomposition of implications that are nested to the right.

Figure 12 demonstrates that this is easily accomplished by means of the standard rule, in which the major premise may be a derived formula. Successive applications of that elimination rule eventually retrieve the succedent of the innermost implication. Clearly, this cannot be accomplished by rule (→E010). While a first application of this rule to the proud major premise p → (q → r) yields the formula q → r, this formula in turn no longer stands proud and can thus not be used as premise for a second application. On the other hand, while an application of this rule to a proud premise q → r yields the required formula r, it is not feasible to combine these two derivations into a single one. Consequently, the end formula of Figure 12 cannot be derived by means of rule (→E010), thereby rendering a calculus using this rule of implication elimination incomplete.

On the other hand, the general elimination variant (→E011) of this rule does allow for the composition of these two subderivations via the third premise. Figure 13 demonstrates that the consequence formula of the first application of this rule becomes the parametric formula in the second application, which really composes the two subderivations by referencing the formula q → r as an assumption. This simple procedure can be repeated as required to decompose arbitrarily deep nestings of

implications.24 Indeed, this observation is a consequence of Tennant's more general result mentioned earlier that all derivations in a natural deduction calculus that employs general elimination rules25, in which all major premises must stand proud for all connectives, are already in normal form.

To summarize, only three of the eight rules for implication elimination are suitable for the purpose of a complete calculus, only one of which is new. Apart from the standard elimination rule (→E110) and the general elimination rule (→E111), only the general elimination rule (→E011) with proud major premise retains completeness. Neither choice of the former two restricts compositionality of the premises. Quite to the contrary, the general elimination rule even allows for the preempting of subsequent rule applications by enabling the composition of a third derivation that depends on the formula B which is the conclusion of the standard rule. Interestingly enough, it is this general feature that renders completeness possible in the case of the proud rule (→E011). In contrast to this enabling feature of the general format, proudness markers in general significantly restrict the usefulness of the respective rules. Most significantly, the requirement of a proud minor premise prohibits any derivation of a complex implicational antecedent, which leaves the four rules (→E∗0∗) so significantly impaired as to render them practically useless. The requirement of a proud major premise without the redeeming feature of the general rule format is similarly disabling, thereby rendering rule (→E010) useless. It is to be expected that restricting the built-in means of composing derivations by means of proudness markers should have a debilitating effect on the calculus employing the respective rule for the elimination of implication. For this reason the positive result for the rule (→E011) is what stands out in this review. We will return to this important observation and expand on it in the following discussion.

#### **5.2 A calculus for logical ground inferences and composition**

The situation changes significantly if rules for explicit composition, (ec) as well as (ce), are available in the calculus. Since all of the rules (→E∗∗∗) are derivable from (→E000) alone by means of explicit compositions, this rule in itself codifies all pertinent logical relationships between the relevant (schematic) formulae for the delineation of how implication is to be used. Consequently, in such a setting the matter of composing derivations can be regarded completely independently of the purely logical considerations. As far as implication elimination is concerned, this results in *logical ground inferences*, i.e., "seedling" instances of (→E000), that can subsequently

<sup>24</sup> This procedure cannot be used to circumvent the problem encountered with rules in which the minor premise must stand proud, since the general elimination format adds a third premise that is via its assumption related to the conclusion of the standard rule (the succedent of the implication), not its minor premise (the antecedent of the implication).

<sup>25</sup> Tennant calls the general elimination format "parallel forms".

be composed with other derivations by means of explicit composition.26 That is to say that the elimination of implication is never applied as a rule to conclusions of derivations but always occurs as an instance of (→E000) with premises standing proud. It is instructive to become aware of the fact that the application of a rule consists of two acts that can actually be viewed as independent of one another. This amounts to nothing less than the distinction between the logical aspect of the rule on the one hand, that which is expressed by the logical ground inference, and its structural or procedural aspect, which allows the integration of said logical aspect into a continuing flow of other inferences. Such a view is something of a contrast to the usual manner in which presentations of the calculus of natural deduction cast reasoning itself as progressions of eliminations and introductions of logical symbols. Thus, the suggested use of logical ground inferences together with explicit composition could serve as a starting point for a systematic fine-grained analysis of the notion of reasoning in natural deduction.

Unfortunately, the generalization of this approach to the entire calculus of natural deduction meets the substantial difficulty that not all logical inference rules can be reduced to logical ground inferences. We have concerned ourselves only with implication elimination, and a similar analysis can be made for most of the inference rules of natural deduction. The notable exceptions are the elimination rules for disjunction and existential quantification as well as the introduction rule of implication and the classical absurdity rule. The two elimination rules are already in the general elimination format, which quite apparently cannot be undone.27 While that still leaves the possibility to add proudness markers to the major premises of those rules, the idea of an undifferentiated generality has thereby been shown to be impossible. Implication introduction only has a single premise that allows the discharge of an assumption. Even if we were to allow a proudness marker for premises that allow for discharge, the only implications derivable would be trivial tautological instances. Moreover, while (ec) can be used on the conclusion of implication introduction, it merely results in a general introduction rule, something we have not considered in the context of this investigation at all.28 The classical absurdity rule has a format quite similar to that of implication introduction. The premise of that rule must be the absurdity constant, whereas the application of the rule allows the discharge of a negated formula.

<sup>26</sup> This situation is similar to the one we described for the sequent calculus that uses the logical ground sequents (Arndt, 2019). The implication left rule can be replaced by the ground sequent A → B, A ⇒ B that expresses how implication is to be used. The cut rule must be used to relate such statements of logical relationships to other sequents. Thus, purely logical relationships are expressed horizontally, i.e., within logical ground sequents, whereas matters of reasoning are encoded in the structural rules of the calculus.

<sup>27</sup> In the case of disjunction elimination, one would have to introduce a dual conclusion logical ground inference, and in the case of existential elimination, a logical ground inference would violate the Eigenvariable condition.

<sup>28</sup> What is remarkable in connection with implication introduction, however, is the fact that it is closely related to explicit composition. Consider the following two contractions, which proceed from a maximal occurrence of an implication in the derivation on the left to its usual contractum in the derivation on the right:

Therefore, demanding that absurdity stand proud renders this rule a severely restricted variant of the intuitionistic absurdity rule.

Moreover, there are several useful properties of the natural deduction calculus that are countered, or at least made more complicated, by an approach that delaminates the inference rules. For one, a single application of standard implication elimination has to be simulated by one instance of (→E000) and one application each of (ec) and (ce), so derivations of the same end formula obviously involve more rule applications. More serious, however, is the fact that maximal occurrences of implications are no longer easily identifiable, since the major premise in (→E000) stands proud. Maximality becomes a relational property between the simple premise of explicit composition and an assumption that its other premise depends on. Consequently, the meta theory of normalizability becomes considerably more involved.29

#### **6 Discussion**

The professed aim of this article was to give an exhaustive list of all the inference rules that can be considered to express the idea of eliminating an implication in the calculus of natural deduction. The starting point was a previous comparison of eight variants of inference rules for implication in the antecedent in the sequent calculus. Although the only alternative to the standard implication elimination rule that has been seriously discussed in the literature is the general elimination rule, the guiding idea was that it should somehow be possible to formulate eight distinct implication elimination rules in accordance with the left rules obtained in the sequent calculus.

Prawitz (1965) observed that the (intuitionistic) sequent calculus can be understood

$$
\dfrac{\dfrac{\begin{array}{c}[A]^{1}\\ \mathcal{D}_{1}\\ C\end{array}}{A \to C}\;(\to\mathrm{I}\text{-}1)\qquad \begin{array}{c}\mathcal{D}_{2}\\ A\end{array}}{C}\;(\to\mathrm{E})
\qquad\rightsquigarrow\qquad
\begin{array}{c}\mathcal{D}_{2}\\ A\\ \mathcal{D}_{1}\\ C\end{array}
$$

We already demonstrated on page 246 that the derivation on the right is the contractum of an application of (ec), and it is naturally also the contractum of (ce). Thus, while an introduced implication A → C expresses the fact that it is possible to derive C from the assumption(s) A, explicit composition further states that if, in addition to that derivation, A is given independently, then C can be reobtained. In other words, while (→I) codifies (certain aspects of) a derivation into a formula, explicit composition uses that derivation and a single instance of its discharged assumption(s) to move on to its conclusion. Formally, the premise of (ec) becomes the antecedent subformula of the implication in (→I):

$$
\dfrac{\begin{array}{c}[A]\\ \mathcal{D}_{1}\\ C\end{array}\qquad A}{C}\;(\mathrm{ec})
\qquad\qquad
\dfrac{\begin{array}{c}[A]\\ \mathcal{D}_{1}\\ C\end{array}}{A \to C}\;(\to\mathrm{I})
$$

29 This is not necessarily a disqualifying feature. The theory of λ-calculi with explicit substitution has become a prolific field because of, not despite, these rather fruitful complexities.

266 Michael Arndt

$$
\dfrac{\begin{array}{c}\mathcal{D}_{1}\\ \Gamma \Rightarrow A\end{array}\qquad \begin{array}{c}\mathcal{D}_{2}\\ B,\, \Delta \Rightarrow C\end{array}}{A \to B,\, \Gamma,\, \Delta \Rightarrow C}\;(\to\mathrm{L})
\qquad\rightsquigarrow\qquad
\dfrac{\;\overline{A \to B}\;\qquad \begin{array}{c}\Gamma\\ \mathcal{D}_{1}^{*}\\ A\end{array}\qquad \begin{array}{c}[B]^{n},\,\Delta\\ \mathcal{D}_{2}^{*}\\ C\end{array}}{C}\;(\to\mathrm{E}_{011}\text{-}n)
$$

**Fig. 14** The translation of (→L) into natural deduction.

as a meta-calculus for natural deduction in the sense that its rules can be read as derivability statements for deductions. Every premise in a sequent calculus rule corresponds to some natural deduction derivation of its succedent formula from the assumptions given in its antecedent. The application of the rule in natural deduction combines those given derivations into one corresponding to the conclusion of the sequent calculus rule. The new end formula must thus be the succedent formula, and the assumptions of the new derivation are simply the combination of the assumptions of the given derivations. If any specific formula mentioned in the antecedent of a premise does not reoccur in the antecedent of the conclusion, this assumption must be marked as discharged in the resulting natural deduction derivation.30 Remarkably, this naive characterization does not cover the correspondence between the main formulae of the left rules and the main formulae of elimination rules. Since such a formula does not occur in a premise of the sequent calculus rule but is instead introduced into the antecedent of the conclusion, a corresponding derivation in natural deduction must introduce that formula as an additional assumption that stands proud.
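This reading of a sequent rule as an operation on derivations can be made concrete in a few lines of Python. The sketch below is our own illustration, not code from the chapter: a derivation is reduced to its end formula and the multiset of its open assumptions, and the translation of (→L) combines two derivations, discharges one occurrence of B, and adds the main formula as a new, proud assumption.

```python
# Illustrative sketch only (our own modelling, not from the chapter):
# a derivation is represented just by its end formula and the multiset
# of its open assumptions.
from collections import Counter

def derivation(end, assumptions):
    return {"end": end, "assumptions": Counter(assumptions)}

def imp_left_as_nd(d1, d2, a, b):
    """Read (->L) with main formula a -> b as an operation on derivations:
    d1 derives the side formula a, d2 derives its end formula from b (and
    further assumptions). One occurrence of b is discharged, and the main
    formula a -> b is added as a new, proud assumption."""
    assert d1["end"] == a, "left premise must derive the side formula"
    assert d2["assumptions"][b] >= 1, "b must be an open assumption"
    combined = d1["assumptions"] + d2["assumptions"]
    combined[b] -= 1                      # discharge one occurrence of b
    combined[f"({a}->{b})"] += 1          # the proud major premise
    return {"end": d2["end"], "assumptions": combined}

d1 = derivation("A", ["G1"])              # Gamma = {G1}, end formula A
d2 = derivation("C", ["B", "D1"])         # Delta = {D1}, open assumption B
d = imp_left_as_nd(d1, d2, "A", "B")
print(d["end"])                                               # C
print(sorted(k for k, v in d["assumptions"].items() if v > 0))
```

The multiset discipline mirrors footnote 30: whether discharge removes one occurrence of B or several is exactly the kind of structural choice an exact correspondence has to fix.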

The translation of a schematic application of rule (→L) based on this interpretation is depicted in Figure 14. The key to an accurate correspondence between the rules of the sequent calculus and rules of natural deduction must thus be to take the different treatments of main formula and side formula seriously. The former yields a new assumption, whereas the latter results in an ordinary premise that can be used to continue a given derivation with that end formula. For the rule (→L) the main formula is A → B and the relevant side formula is A. Both correspond to premises in the natural deduction rule, but the former must be marked to stand proud in order to guarantee that it is an assumption of the derivation. Thus, the natural deduction rule that is obtained by translating the standard left implication rule of the sequent calculus in this manner is rule (→E011):

$$
\dfrac{\;\overline{A \to B}\;\qquad A\qquad \begin{array}{c}[B]\\ \vdots\\ C\end{array}}{C}\;(\to\mathrm{E}_{011})
$$

As we discussed in the previous section, replacing the standard implication elimination with most of the new rules would render the calculus incomplete. This would be a

<sup>30</sup> An exact correspondence would have to address various structural issues, such as whether antecedents should rather be considered as sequences (without an exchange rule) than as multisets, and whether the mentioning of a formula in the antecedent of a premise refers to a single formula occurrence or to several such occurrences that are to be removed in the conclusion rather than by the application of an independent contraction rule.

rather disappointing result, were it not for the exception of this rule (→E011), which deserves to be called *proud implication elimination*. Taking this rule seriously grants the opportunity to readdress several issues regarding the calculus of natural deduction.

#### **6.1 Improving the relationship between Gentzen's calculi**

Due to the manner in which this new rule was obtained, a natural deduction calculus using proud implication elimination exhibits an intimate correspondence to the sequent calculus without the cut rule (as far as minimal implication logic is concerned). This apparent fact holds the key for a fruitful reconsideration of the relationship between Gentzen's two calculi.

Translations between the calculi have been unwieldy from the very beginning (Gentzen, 1935), and the exact nature of the complications has been discussed in textbooks and specialised treatises (Negri and von Plato, 2001; von Plato, 2003b). The difficulties preventing a one-to-one translation, insofar as they pertain to the meta theories of the two calculi, have been succinctly characterized by Tesconi (2011):

[One translation] maps normal to non cut-free derivations, because elimination rules are translated to left-side rules with the help of the cut rule; [. . .] whereas it is necessarily the case that [the converse translation] maps cut-free on normal derivations, it does not necessarily map cuts on instances of non-normality, because composition – by means of which the cut rule is translated – of normal derivations may or may not preserve normality.

Both of these problems are due to the fact that major premises of elimination rules in natural deductions can be arbitrarily derived formulae, i.e., that derivations can pass through such premises. Consequently, the complexity (the number of logical constants) of formulae encountered along a branch passing downward through major premises can increase or decrease. In contrast to this, since all of the logical rules of the sequent calculus are introduction rules, the complexity of formulae can only increase through their application. The elimination of a major premise must therefore be mimicked by the introduction of that formula into the antecedent of a sequent. The fact that this very premise may be obtained as a conclusion of a subderivation in the natural deduction derivation can only be rendered in the sequent calculus by a subsequent cut of that formula with a sequent that corresponds to that subderivation in which the formula appears in the succedent.
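Schematically, the situation can be pictured as follows (a sketch in the notation of this discussion, with Γ and Δ standing for the open assumptions of the two subderivations): an application of (→E) whose major premise A → B is derived by a subderivation D is mimicked by introducing A → B via (→L) and then cutting it away against the sequent Γ ⇒ A → B that corresponds to D:

$$
\dfrac{\begin{array}{c}\mathcal{D}\\ A \to B\end{array}\qquad A}{B}\;(\to\mathrm{E})
\qquad\rightsquigarrow\qquad
\dfrac{\Gamma \Rightarrow A \to B\qquad \dfrac{\Delta \Rightarrow A\qquad B \Rightarrow B}{A \to B,\, \Delta \Rightarrow B}\;(\to\mathrm{L})}{\Gamma,\, \Delta \Rightarrow B}\;(\mathrm{cut})
$$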

The fact that the structural rule of cut, which is the object of the sequent calculus' meta theory, is required to render the utterly basic act of applying an elimination rule to build up a derivation in the calculus of natural deduction is astounding. Consequently, the meta theories of the two calculi are egregiously misaligned. The *Hauptsatz* of the sequent calculus revolves around the fact that in any derivation all instances of the cut rule, a proper rule of the calculus, can be eliminated. On the other hand, normalization considers the removal of maximal formula occurrences by means of substitution, a procedure that is a meta operation on derivations.

In a natural deduction calculus with proud implication elimination, the continuation of derivations through major premises is impossible. This restriction prevents the

$$
\dfrac{\begin{array}{c}\mathcal{D}_{1}\\ A \to B\end{array}\quad \begin{array}{c}\mathcal{D}_{2}\\ A\end{array}\quad \begin{array}{c}[B]^{n}\\ \mathcal{D}_{3}\\ C\end{array}}{C}\;(\to\mathrm{gE}\text{-}n)
\quad\rightsquigarrow\quad
\dfrac{\dfrac{\;\overline{[A \to B]^{m}}\;\quad \begin{array}{c}\mathcal{D}_{2}\\ A\end{array}\quad \begin{array}{c}[B]^{n}\\ \mathcal{D}_{3}\\ C\end{array}}{C}\;(\to\mathrm{E}_{011}\text{-}n)\qquad \begin{array}{c}\mathcal{D}_{1}\\ A \to B\end{array}}{C}\;(\mathrm{ec}\text{-}m)
$$

**Fig. 15** Recovering (→gE) by means of (ec).

formation of maximal implicational formula occurrences, and, in the case of minimal implicational logic, it guarantees normality of derivations.31 Nonetheless, a rich meta theory is still feasible. What is required is the addition of the rule of explicit composition (ec) and its variant (ce) to the calculus, which take on the role of enabling redirections that can circumvent the proudness requirement. Figure 15 reiterates how an application of the general elimination rule can be rendered by an application of rule (→E011), followed by an explicit composition of the proud premise A → B.

While an in-depth discussion of the impact that these modifications have on the translations between natural deduction and the sequent calculus is a task left for another place, it should already be apparent from this presentation that they can significantly improve the relationship between the two calculi. What is also apparent is that such a modification rather changes the character of the calculus of natural deduction. Natural deduction has been the showcase of the inferentialist theory of meaning; the addition of a rule for explicit composition could be downplayed as a merely technical matter that changes the meta theory, but it has the effect of somewhat dulling the shine of the calculus in view of that theory.

#### **6.2 Bidirectional natural deduction**

Natural deduction, with its emphasis on the relationship between premises and conclusions (the latter of which can be rendered as succedent formulae in a sequent-style notation), is a fundamentally asymmetrical calculus. While assumptions are explicitly stated, they retreat into the background as ever new applications of rules remove them further and further from the current end formula (the conclusion of the last rule application), and the only possible purpose in revisiting any of them is to finally discharge them. The most significant conceptual divide between the two Gentzen calculi lies in the fact that, in stark contrast to natural deduction, the sequent calculus is symmetrical in that both antecedent and succedent are fundamentally treated in the same manner.32 Indeed, to achieve the supposedly "natural" relationship between

<sup>31</sup> According to Tennant's result, systematically using proud elimination rules for the other logical constants guarantees universal normality of derivations.

<sup>32</sup> At first glance the conceptual divide appears to be the fact that in the sequent calculus the (horizontal) direction relating hypotheses to consequence(s) is decoupled from the (vertical)

any number of assumptions and a single assertive conclusion in the sequent calculus, its sequents have to be artificially restricted to singleton succedents.

When considering a possible evolution of calculi from the entirely assertion-based Frege-Hilbert style calculus33 towards calculi which give increasingly more significance to the role of assumptions, natural deduction is clearly only the first step. For, while it allows derivations to depend on assumptions and implicitly develops consequence relations between assumptions and the current end formula, the assumptions remain unaltered until discharged. Clearly, this is the exercise of *assertive reasoning* on the basis of initial assumptions. The full capacity of reasoning, including *assumptive reasoning*, in which an inference step can consider some (momentarily) static conclusion in relation to changing assumptions, is only realized in the sequent calculus.

This touches upon a matter that has been discussed by Schroeder-Heister. In his discussion of a proper basis of proof-theoretic semantics (Schroeder-Heister, 2009), he describes four features that are taken as integral to the standard paradigm of proof-theoretic semantics:


About these he states:

These four features are intimately connected to the model of natural deduction as its formal background. This holds especially for () and (), which specify () and (), respectively.

Schroeder-Heister criticises these commitments, which are imposed simply by the formal framework of natural deduction, a feature he calls its *unidirectionality*. His suggestion of an alternative paradigm is based on his preference for local over global reasoning. Seeing that preference realized in the sequent calculus, he proceeds to formulate a natural deduction rule for implication elimination that embraces, and introduces into natural deduction, the *bidirectionality* that is inherent in the sequent calculus:

direction of reasoning, whereas the two are aligned in natural deduction. The sequent calculus explicitly derives sequents, i.e., syntactic representations of consequence relations between sets of formulae. On the other hand, the rules of natural deduction appear to be formulated for the purpose of deriving individual formulae, since every rule of the calculus locally relates one or more premises to a conclusion. However, most rules have the significant side effect of modifying the global dependency of that conclusion on assumptions, either by allowing the discharge of certain assumptions or by joining the assumptions on which the premises depended. Thus, every new conclusion obtained by the application of an inference rule has to be seen in relation to the collection (usually a set or multiset) of still undischarged assumptions. Indeed, it is easy to record the dependencies of conclusions on assumptions explicitly in a sequent-style manner, which most notably results in redundant accounting of assumptions. Consequently, a derivation in natural deduction is simply a highly compactified representation of onion layers of consequence relationships.

<sup>33</sup> The deduction theorem, which states that Γ, A ⊢ B if Γ ⊢ A → B and only through this admits derivations based on assumptions, constitutes the meta theory of those calculi.


$$
\dfrac{\;\overline{B \to (B \to C)}\;\qquad \begin{array}{c}\Lambda\\ \vdots\\ B\end{array}\qquad \dfrac{\;\overline{[B \to C]^{1}}\;\qquad \begin{array}{c}\Lambda\\ \vdots\\ B\end{array}\qquad \begin{array}{c}[C]^{n},\,\Sigma\\ \vdots\\ D\end{array}}{D}\;(\to\mathrm{E}_{011}\text{-}n)}{D}\;(\to\mathrm{E}_{011}\text{-}1)
$$

**Fig. 16** Two successive applications of (→E011).

$$
\dfrac{\;\overline{A \to B}\;\qquad A}{B}
$$

The bar above the major premise expresses "the crucial modification that the major premiss must now be an assumption, i.e., must occur in top position". Thus, Schroeder-Heister had already singled out proud implication elimination, the rule that is (→E011) in our terminology, as the philosophically most promising alternative to the standard rule for the purpose of developing a more general approach to proof-theoretic semantics.

It should be noted, however, that "bidirectionality" must not be understood in the literal sense of directly applying rules upwards, i.e., by drawing an inference stroke above some undischarged assumption and placing the relevant premises as new assumptions on top of it. Instead, such an assumption must be discharged by a regular (downwards) application of the respective proud inference rule, thereby utilizing its capability to preempt and explicitly compose a derivation that would ordinarily follow the application of the corresponding standard rule. Figure 16 depicts two successive applications of proud implication elimination. It is apparent that the major premise B → C of the first rule application remains in topmost position even after the second application of the same rule, although it gets discharged in the process. Nonetheless, by means of this discharge and the positioning of a new major premise, the second application effectively replaces the previous assumption B → C by the new assumption B → (B → C).34

#### **7 Summary and outlook**

We had set ourselves the somewhat technical task of translating the eight alternative rules for left implication in the sequent calculus into natural deduction. The fact that natural deduction does not encompass a cut rule, which is used to derive those alternatives in the sequent calculus, made the task somewhat intractable. The solution was to introduce an explicit composition rule into the calculus of natural deduction

<sup>34</sup> It remains to be seen whether this apparently quite cumbersome and indirect notion of bidirectionality amounts to anything useful beyond a mere proof of principle.

(indeed, two such rules with flipped premises were required). The possibility of suspending derivations following the application of a rule in natural deduction allowed for the derivation of the general implication elimination rule from the standard one. However, this enhancement only served to distinguish four of the original alternatives (namely those that would translate into the standard elimination rule) from four others (those that would translate into the general elimination rule). Furthermore, a mechanism was required that restricts the application of rules in such a way that certain premises are guaranteed to stand proud as assumptions. Assigning these proudness markers to either of the two premises of the standard or the general implication elimination rule resulted in eight distinct natural deduction rules reflecting the character of the eight left implication rules of the sequent calculus. An analysis of the new natural deduction rules revealed that apart from the standard and the general implication elimination rules, only one of the new rules retains completeness of the calculus, namely the rule of proud implication elimination. It is this rule that should be considered the definitive translation of the standard left implication rule of the sequent calculus. As such, it not only significantly improves the translations of sequent calculus derivations into natural deduction, it also introduces the paradigm of bidirectionality, or assumptive reasoning, into the framework of natural deduction.

The obvious next step is a comprehensive investigation of bidirectional natural deduction employing proud elimination rules for all logical constants as well as the rules of explicit composition.

**Acknowledgements** Supported by the DFG project "Paul Hertz and his foundations of structural proof theory" (DFG AR 1010/2-1).

#### **References**


Smullyan, R. M. (1968). Analytic cut. *Journal of Symbolic Logic* 33, 560–564.



# **Focusing Gentzen's LK Proof System**

Chuck Liang and Dale Miller

**Abstract** Gentzen's sequent calculi LK and LJ are landmark proof systems. They identify the structural rules of weakening and contraction as notable inference rules, and they allow for an elegant statement and proof of both cut elimination and consistency for classical and intuitionistic logics. Among the undesirable features of those sequent calculi is that their inference rules are low-level and frequently permute over each other. As a result, large-scale structures within sequent calculus proofs are hard to identify. In this paper, we present a different approach to designing a sequent calculus for classical logic. Starting with Gentzen's LK proof system, we examine the *proof search* meaning of his inference rules and classify those rules as involving either *don't care nondeterminism* or *don't know nondeterminism*. Based on that classification, we design the *focused* proof system LKF, in which inference rules belong to one of two phases of proof construction depending on which flavor of nondeterminism they involve. We then prove that the cut rule and the general form of the initial rule are admissible in LKF. Finally, by showing that the inference rules of LK are all admissible in LKF, we can give a relative completeness proof for LKF provability with respect to LK provability. We shall also apply these properties of the LKF proof system to establish other meta-theoretic properties of classical logic, including Herbrand's theorem.

**Key words:** sequent calculus, Gentzen's LK, focused proof system, LKF, polarization, cut elimination

Chuck Liang
Department of Computer Science, Hofstra University, Hempstead, NY, United States of America, e-mail: cscccl@hofstra.edu

Dale Miller
Inria & LIX/Ecole Polytechnique, Palaiseau, France, e-mail: dale.miller@inria.fr

© The Author(s) 2024
T. Piecha and K. F. Wehmeier (eds.), *Peter Schroeder-Heister on Proof-Theoretic Semantics*, Outstanding Contributions to Logic 29, https://doi.org/10.1007/978-3-031-50981-0\_9

#### **1 Introduction**

In his attempt to prove the *Hauptsatz* (cut elimination) for both intuitionistic and classical logics, Gentzen (1935) moved away from natural deduction to the sequent calculus. The sequent calculus allowed him to introduce the structural rules of weakening and contraction: their use on the right-hand side of sequents was fundamental to capturing both classical and intuitionistic logics in one framework. If we are only interested in proving cut elimination and consistency, then the sequent calculus, as Gentzen presented it, is a great tool. If, however, we wish to apply logic and proof theory to, say, computation, then Gentzen's sequent calculus has some significant problems: we discuss four such problems in Section 2.

In earlier work (Liang and Miller, 2009), we have presented the *focused proof system* LJF as an improved version of Gentzen's sequent system LJ for intuitionistic logic. Such focused proof systems have been used to give a foundation to logic programming (Miller, 1989; Miller, Nadathur, Pfenning, and Scedrov, 1991), model checking (Heath and Miller, 2019), and term representation (Herbelin, 1995; Scherer, 2016).

This paper examines a *focused* version of the LK sequent calculus proof system, called LKF. The key properties of LKF — cut elimination and relative completeness for LK — have been proved elsewhere (Liang and Miller 2009; 2011) by using complex and indirect arguments involving linear logic (Girard, 1987), a focused proof system for linear logic due to Andreoli (1992), and the focused proof system LJF. Here, we present LKF from first principles: we make no use of intuitionistic or linear logics nor of the meta-theory of other proof systems. Additionally, the proof transformations here, including those for cut elimination, should be easier to formalize in, say, proof assistants than the more abstract arguments used elsewhere (including in Liang and Miller, 2011).

After presenting the LK inference rules, we describe some of the shortcomings of that proof system in Section 2. In Section 3, that criticism of LK motivates the design of LKF. We then prove the following results about LKF.


Taken together, these results prove that LKF is complete for LK. A similar proof outline for proving the relative completeness of focused proof systems has been used by Laurent (2004) for linear logic, by Chaudhuri, Pfenning, and Price (2008) for an intuitionistic version of linear logic, and by Simmons (2014) for a propositional intuitionistic logic. The proofs of these meta-theoretic results for LKF rely almost exclusively on tedious arguments about the permutability of inference rules. One of the design goals for LKF has been to build a calculus that can be used directly to prove other proof-theoretic results without the need for such tedious permutation arguments. We illustrate this principle by proving the admissibility of cut in cut-free LK (Section 9.1) and by proving Herbrand's theorem (Section 9.3): neither proof explicitly involves permutation arguments.

Structural rules

$$
\frac{\Gamma, B, B \vdash \Delta}{\Gamma, B \vdash \Delta}\; cL \qquad
\frac{\Gamma \vdash \Delta, B, B}{\Gamma \vdash \Delta, B}\; cR \qquad
\frac{\Gamma \vdash \Delta}{\Gamma, B \vdash \Delta}\; wL \qquad
\frac{\Gamma \vdash \Delta}{\Gamma \vdash \Delta, B}\; wR
$$

Identity rules

$$
\frac{}{B \vdash B}\; init \qquad
\frac{\Gamma \vdash \Delta, B \qquad \Gamma', B \vdash \Delta'}{\Gamma, \Gamma' \vdash \Delta, \Delta'}\; cut
$$

Introduction rules

$$
\begin{array}{ll}
\dfrac{\Gamma, B_i \vdash \Delta}{\Gamma, B_1 \wedge B_2 \vdash \Delta}\ \wedge L &
\dfrac{\Gamma \vdash \Delta, B \qquad \Gamma \vdash \Delta, C}{\Gamma \vdash \Delta, B \wedge C}\ \wedge R \qquad
\dfrac{}{\Gamma \vdash \Delta, t}\ tR \\[12pt]
\dfrac{\Gamma, B \vdash \Delta \qquad \Gamma, C \vdash \Delta}{\Gamma, B \vee C \vdash \Delta}\ \vee L \qquad
\dfrac{}{\Gamma, f \vdash \Delta}\ fL &
\dfrac{\Gamma \vdash \Delta, B_i}{\Gamma \vdash \Delta, B_1 \vee B_2}\ \vee R \\[12pt]
\dfrac{\Gamma \vdash \Delta, B \qquad \Gamma', C \vdash \Delta'}{\Gamma, \Gamma', B \supset C \vdash \Delta, \Delta'}\ \supset L &
\dfrac{\Gamma, B \vdash \Delta, C}{\Gamma \vdash \Delta, B \supset C}\ \supset R \\[12pt]
\dfrac{\Gamma, B[t/x] \vdash \Delta}{\Gamma, \forall x.B \vdash \Delta}\ \forall L &
\dfrac{\Gamma \vdash \Delta, B[y/x]}{\Gamma \vdash \Delta, \forall x.B}\ \forall R \\[12pt]
\dfrac{\Gamma, B[y/x] \vdash \Delta}{\Gamma, \exists x.B \vdash \Delta}\ \exists L &
\dfrac{\Gamma \vdash \Delta, B[t/x]}{\Gamma \vdash \Delta, \exists x.B}\ \exists R
\end{array}
$$

**Fig. 1** The rules for LK. In the ∀R and ∃L rules, the variable y is not free in the conclusion. In the ∧L and ∨R rules, i ∈ {1, 2}. In ∀L and ∃R, t is a first-order term.

#### **2 The** LK **proof system**

Formulas for first-order classical logic are defined as follows. Atomic formulas are of the form p(t1, . . . , tn), where n ≥ 0, p is a predicate of arity n, and t1, . . . , tn is a list of first-order terms. Formulas are built from atomic formulas using the logical connectives ∧, *t*, ∨, *f*, ⊃ as well as the two first-order quantifiers ∀ and ∃. We shall assume the usual treatment of bound variables and substitutions: in particular, the expression B[t/x] denotes the result of performing a capture-avoiding substitution of the term t for all free occurrences of the variable x in the formula B.
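Capture-avoiding substitution is easy to get wrong, so an executable sketch may be useful. The representation below is our own and not the paper's notation: formulas are tagged tuples, variables are strings, and terms are either variables or flat function applications.

```python
# Our own toy encoding (not the paper's): formulas are ("atom", p, args),
# ("imp", B, C) or ("forall", x, B); terms are variable names (strings)
# or flat applications ("fn", f, args) whose args are variables.
import itertools

def free_vars(f):
    tag = f[0]
    if tag == "atom":
        return {a for a in f[2] if isinstance(a, str)}
    if tag == "forall":
        return free_vars(f[2]) - {f[1]}
    if tag == "imp":
        return free_vars(f[1]) | free_vars(f[2])

def subst(f, x, t):
    """Return f with term t substituted for free occurrences of variable x,
    renaming a bound variable when it would capture a variable of t."""
    tag = f[0]
    if tag == "atom":
        return ("atom", f[1], tuple(t if a == x else a for a in f[2]))
    if tag == "imp":
        return ("imp", subst(f[1], x, t), subst(f[2], x, t))
    if tag == "forall":
        y, body = f[1], f[2]
        if y == x:                      # x is bound here: nothing to do
            return f
        t_vars = {t} if isinstance(t, str) else set(t[2])
        if y in t_vars:                 # rename y to avoid capture
            fresh = next(v for v in (f"{y}_{i}" for i in itertools.count())
                         if v not in t_vars | free_vars(body))
            body = subst(body, y, fresh)
            y = fresh
        return ("forall", y, subst(body, x, t))

# (forall y. p(x, y))[y/x] must rename the bound y before substituting
f = ("forall", "y", ("atom", "p", ("x", "y")))
print(subst(f, "x", "y"))   # ('forall', 'y_0', ('atom', 'p', ('y', 'y_0')))
```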

Figure 1 presents the LK sequent proof calculus of Gentzen (1935). Inference rules relate *sequents*, which are pairs of multisets of formulas, formally written with an infix ⊢. The rules there are divided into introduction rules, structural rules, and *identity rules*. Note that the rules in this latter group, namely the *init* and *cut* rules, require checking that two occurrences of a formula, here B, on different sides of a sequent or in different sequents, have the same identity (e.g., are equal). Note also that the first important results about the LK sequent calculus imply that completeness is maintained if almost all of the identity rules are eliminated: one must only retain occurrences of the *init* rule where B is atomic.
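Because sequents are pairs of multisets, Python's collections.Counter provides a direct toy model. The code below is our own illustration, not part of the paper: with multisets there is no exchange rule to state, and contraction and the atomic init check become small multiset operations.

```python
# Our own toy rendering of sequents as pairs of multisets, so that no
# exchange rule is needed (formulas here are just strings).
from collections import Counter

def sequent(gamma, delta):
    return (Counter(gamma), Counter(delta))

def contract_left(seq, b):
    """cL: replace two occurrences of b in the antecedent by one."""
    gamma, delta = seq
    assert gamma[b] >= 2, "cL needs two occurrences of b on the left"
    gamma = gamma.copy()
    gamma[b] -= 1
    return (gamma, delta)

def is_atomic_init(seq):
    """An init leaf B |- B where B is atomic (here: any plain string)."""
    gamma, delta = seq
    return (sum(gamma.values()) == sum(delta.values()) == 1
            and set(gamma) == set(delta)
            and all(isinstance(b, str) for b in gamma))

s = sequent(["B", "B", "C"], ["D"])
g, d = contract_left(s, "B")
print(g["B"], g["C"])                          # 1 1
print(is_atomic_init(sequent(["B"], ["B"])))   # True
```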

The main differences between the proof system in Figure 1 and Gentzen's presentation of LK are the following.

1. In Gentzen's system, contexts are lists of formulas, and the exchange rule, which allowed two adjacent formulas to be swapped, was used. In Figure 1, contexts (Γ and Δ) are multisets of formulas, and the exchange rule is not used.


For this paper, we shall make the following distinction between proof and derivation. By *proof*, we mean a tree structure of inference rules and sequents such that all premises are closed, in the sense that the inference rules at the leaves have zero premises (such as the initial rule). By *derivation*, we mean a similar tree structure of inference rules and sequents, but we do not assume that all leaves are closed: derivations can have unproved premises.

Gentzen's sequent calculus was designed to support the proof of cut elimination (for both classical and intuitionistic logics). As we suggested in the introduction, sequent calculus is difficult to apply in a number of application areas. We describe four major shortcomings of the LK sequent calculus.

#### **2.1 The collision of cut and the structural rules**

Consider the following instance of the cut rule.

$$(\dagger) \qquad \frac{\Gamma \vdash C \qquad \Gamma', C \vdash B}{\Gamma, \Gamma' \vdash B}\ \textit{cut}$$

If the right premise is proved by a left-contraction rule from Γ′, *C*, *C* ⊢ *B*, then cut elimination proceeds by permuting the *cut* rule to the right premise, yielding the derivation

$$
\begin{array}{c}
\Gamma \vdash C \qquad
\dfrac{\Gamma \vdash C \qquad \Gamma', C, C \vdash B}{\Gamma, \Gamma', C \vdash B}\ \textit{cut}
\\ \hline
\Gamma, \Gamma, \Gamma' \vdash B
\\ \hline\hline
\Gamma, \Gamma' \vdash B
\end{array}
\ \textit{cL}
$$

(An inference figure written with double lines indicates possibly several applications of the rules listed as its justification.) In the intuitionistic variant of the sequent calculus, it is not possible for the occurrence of *C* in the left premise of (†) to be contracted, since two formulas are not allowed on the right of the sequent arrow. If the cut inference in (†) takes place in the classical proof system LK, it is possible that the left premise is the conclusion of a contraction applied to Γ ⊢ *C*, *C*. In that case, cut elimination can also proceed by permuting the cut rule to the left premise.

$$
\begin{array}{c}
\dfrac{\Gamma \vdash C, C \qquad \Gamma', C \vdash B}{\Gamma, \Gamma' \vdash C, B}\ \textit{cut}
\qquad \Gamma', C \vdash B
\\ \hline
\Gamma, \Gamma', \Gamma' \vdash B, B
\\ \hline\hline
\Gamma, \Gamma' \vdash B
\end{array}
\ \textit{cL, cR}
$$

Thus, in LK, it is possible for both occurrences of *C* in (†) to be contracted and, hence, the elimination of cut is nondeterministic, since the cut rule can move to both the left and right premises.

Such nondeterminism in cut elimination is even more pronounced when we consider the collision of the cut rule with weakening. Consider the derivation (taken from Girard, Taylor, and Lafont, 1989, Appendix B).

$$
\dfrac{\dfrac{\dfrac{\begin{array}{c}\Xi_1\\ \vdash B\end{array}}{\vdash C, B}\ \textit{wR}
\qquad
\dfrac{\begin{array}{c}\Xi_2\\ \vdash B\end{array}}{C \vdash B}\ \textit{wL}}{\vdash B, B}\ \textit{cut}}{\vdash B}\ \textit{cR}
$$

Cut elimination here can yield either Ξ<sub>1</sub> or Ξ<sub>2</sub>: thus, nondeterminism arising from weakening can lead to completely different proofs of *B*. This kind of example does not occur in the intuitionistic (single-conclusion) version of the sequent calculus.

These problems with cut elimination and the structural rules were noted in Danos, Joinet, and Schellinx (1997) and by Lafont in Girard, Taylor, and Lafont (1989). Lafont concludes that, in order to avoid this problem with cut elimination, one can choose between two solutions: either make the sequent calculus asymmetric (leading to intuitionistic logic, where the structural rules are not available on the right) or forbid all structural rules (leading to linear logic, where structural rules are not available on the left and right). It is possible, however, to remain in classical logic by employing a third solution that uses both *polarization* and *focused proof systems*. Such an approach was proposed by Girard (1991) in his LC proof system and by Danos, Joinet, and Schellinx (1997) in their LK proof system. In this paper, we present the LKF proof system, which is also based on the notions of polarization and focusing. As we shall see, the problems with the nondeterminism in cut elimination caused by the use of structural rules in classical logic disappear in LKF for two reasons. First, weakening will be allowed only in the initial rules of LKF, where it cannot cause problems with cut elimination. Second, a cut takes place between two *polarized formulas* of opposite *polarity* and, in LKF, contraction is only applied to positive formulas.

#### **2.2 Permutations of inference rules**

A dominating feature of sequent calculus proofs in LK is that many pairs of inference rules permute over each other (Kleene, 1952). For example, when an occurrence of ⊃L appears below an occurrence of ∀R, as in the derivation

$$
\frac{\Gamma_1 \vdash B, \Delta_1 \qquad \dfrac{\Gamma_2, C \vdash [y/x]D, \Delta_2}{\Gamma_2, C \vdash \forall x.D, \Delta_2}\ \forall R}{\Gamma_1, \Gamma_2, B \supset C \vdash \forall x.D, \Delta_1, \Delta_2}\ \supset L,
$$

the order of these two rules can be switched to form the derivation

$$
\frac{\dfrac{\Gamma_1 \vdash B, \Delta_1 \qquad \Gamma_2, C \vdash [y/x]D, \Delta_2}{\Gamma_1, \Gamma_2, B \supset C \vdash [y/x]D, \Delta_1, \Delta_2}\ \supset L}{\Gamma_1, \Gamma_2, B \supset C \vdash \forall x.D, \Delta_1, \Delta_2}\ \forall R.
$$

Similarly, the following two derivations are such that permuting the inference rules in one derivation yields the other derivation.

$$
\frac{\dfrac{\Gamma, B_i, C_j \vdash \Delta}{\Gamma, B_i, C_1 \wedge C_2 \vdash \Delta}}{\Gamma, B_1 \wedge B_2, C_1 \wedge C_2 \vdash \Delta}
\qquad
\frac{\dfrac{\Gamma, B_i, C_j \vdash \Delta}{\Gamma, B_1 \wedge B_2, C_j \vdash \Delta}}{\Gamma, B_1 \wedge B_2, C_1 \wedge C_2 \vdash \Delta}
$$

If one is trying to find structure in sequent calculus proofs, then it is likely that both of these pairs of derivations should be identified in some way.

The existence of such permutations of inference rules suggests that uncovering structures in proofs will always be disturbed by the possibilities of such shallow rearrangements of inference rules. For such reasons, people have often argued that the "essence" of proof structures is better captured in radically different proof systems, such as expansion trees (Miller, 1987), proof nets (Girard, 1987; Laurent, 2011), and atomic flows (Guglielmi and Gunderson, 2008). In this paper, we also replace Gentzen-style sequent calculus with something else, namely LKF, but this time, that replacement will still resemble sequent calculus, with more structure added to both sequents and inference rules.

An introduction rule of LK is *invertible* if, whenever there is an LK proof of its conclusion, there are LK proofs of its premises. When attempting to build a proof bottom-up, invertible rules can always be applied without losing provability. If an introduction rule is not invertible, it is *non-invertible*. The LK introduction rules can be classified as follows: the invertible rules are ∧R, *t*R, ∨L, *f*L, ⊃R, ∀R, ∃L, while the non-invertible rules are ∧L, ∨R, ⊃L, ∀L, ∃R. Note that every connective has an invertible introduction rule on one side of the ⊢, and every occurrence of the corresponding introduction rule on the other side is non-invertible. (This last statement is vacuously true for *t* and *f* since they have zero introduction rules on the left and right, respectively.) Observing the invertibility of introduction rules allows us to give some structure to the permutation of inference rules. In particular, an invertible rule above any other rule can always be permuted down. Furthermore, two non-invertible rules, one above the other, can always be permuted as well.
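This classification can be recorded in a small table. The following Python fragment is our own encoding (the keys and names are not from the paper) and mechanically checks the left-right duality just described.

```python
# Our own table of the LK introduction rules of Figure 1, keyed by
# (connective, side of the turnstile), with True marking invertible rules.
INVERTIBLE = {
    ('and', 'L'): False, ('and', 'R'): True,   # additive ∧R is invertible
    ('or',  'L'): True,  ('or',  'R'): False,  # additive ∨L is invertible
    ('imp', 'L'): False, ('imp', 'R'): True,
    ('forall', 'L'): False, ('forall', 'R'): True,  # ∀R uses an eigenvariable
    ('exists', 'L'): True,  ('exists', 'R'): False, # ∃R chooses a term
    ('t', 'R'): True, ('f', 'L'): True,             # zero-premise rules
}

# Every connective with rules on both sides is invertible on exactly one side.
for c in ('and', 'or', 'imp', 'forall', 'exists'):
    assert INVERTIBLE[(c, 'L')] != INVERTIBLE[(c, 'R')]
```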

We make one additional observation: if an occurrence of a non-atomic formula on the left or right of a sequent can be the consequence of an invertible rule, that formula occurrence never needs to have a structural rule applied to it. For example, the contraction-left rule never needs to be applied to a disjunction since the disjunction-left rule is invertible.

These three observations about invertible and non-invertible rules — the left-right duality regarding invertibility; the permutations involving invertible and non-invertible rules; and the connection between invertible rules and the structural rules — will all be made explicit in the design of the LKF proof system.

$$
\frac{\Gamma, B_1, B_2 \vdash \Delta}{\Gamma, B_1 \wedge B_2 \vdash \Delta}\ \wedge L
\qquad
\frac{\Gamma \vdash \Delta}{\Gamma, t \vdash \Delta}\ tL
\qquad
\frac{\Gamma \vdash \Delta, B \quad \Gamma' \vdash \Delta', C}{\Gamma, \Gamma' \vdash \Delta, \Delta', B \wedge C}\ \wedge R
\qquad
\frac{}{\vdash t}\ tR
$$

$$
\frac{\Gamma, B \vdash \Delta \quad \Gamma', C \vdash \Delta'}{\Gamma, \Gamma', B \vee C \vdash \Delta, \Delta'}\ \vee L
\qquad
\frac{}{f \vdash}\ fL
\qquad
\frac{\Gamma \vdash \Delta, B, C}{\Gamma \vdash \Delta, B \vee C}\ \vee R
\qquad
\frac{\Gamma \vdash \Delta}{\Gamma \vdash \Delta, f}\ fR
$$

**Fig. 2** The introduction rules for conjunction, disjunction, and their units using multiplicative instead of additive rules.

#### **2.3 Additive and multiplicative rules and connectives**

The LK rules that have two premises can be classified as either *additive*, in which case the side formulas (Γ, Δ) are the same in the conclusion as well as in both premises, or *multiplicative*, in which case the side formulas in the premises (Γ, Δ and Γ′, Δ′) are accumulated to form the side formulas of the conclusion. Of the four inference rules in Figure 1 with two premises, the *cut* rule and the implication-left rule are multiplicative, while the disjunction-left rule and the conjunction-right rule are additive.

Consider the alternative inference rules in Figure 2 for conjunction and disjunction. The rules in that figure with two premises are multiplicative. We can make the following observations.


Although Gentzen used the additive rules for conjunction and disjunction, there are reasons to admit other choices. For example, it is a popular choice to select the invertible right introduction rules for both conjunction and disjunction, which means selecting the additive conjunction and the multiplicative disjunction. Ketonen introduced such a variant of Gentzen's original calculus and used it to give "a strikingly elegant proof of completeness" (von Plato, 2012). People working in automated theorem proving often use the invertible rules since this simplifies implementations of proof search. In particular, it is possible to define one-sided sequent systems for classical logic in such a way that all (right) introduction rules are invertible except for the existential introduction rule. As a result, proof search algorithms can limit backtracking to the treatment of existential quantifiers.

The LKF proof system contains both the additive and multiplicative versions of conjunction and disjunction (and their units).

#### **2.4 The need for synthetic inference rules**

Our final criticism of LK is that its inference rules are too small, especially for applications involving theories. For example, assume that we are working with a theory (a set of assumptions) containing an axiom declaring that the binary predicate *path* is transitive: that is, the theory contains the formula

$$\forall x\, \forall y\, \forall z\; (\mathit{path}(x, y) \supset \mathit{path}(y, z) \supset \mathit{path}(x, z)).$$

If that formula is invoked in an LK proof, there will be a minimum of five introduction rules involved in that invocation (three for the universal quantifiers and two for the implications). That seems unfortunate since it is more natural to view that formula as denoting one of the following inference rules.

$$\frac{\Gamma \vdash \Delta, \operatorname{path}(\mathbf{x}, \mathbf{y}) \quad \Gamma \vdash \Delta, \operatorname{path}(\mathbf{y}, \mathbf{z})}{\Gamma \vdash \Delta, \operatorname{path}(\mathbf{x}, \mathbf{z})}$$

or

$$\frac{path(\mathbf{x}, \mathbf{y}), \ path(\mathbf{y}, \mathbf{z}), \ path(\mathbf{x}, \mathbf{z}), \Gamma \vdash \Delta}{path(\mathbf{x}, \mathbf{y}), \ path(\mathbf{y}, \mathbf{z}), \Gamma \vdash \Delta}$$

These *synthetic rules* would be a more appropriate way to invoke the transitivity axiom. Such synthetic rules have been addressed before in the literature, particularly as a back-chaining inference rule (Hallnäs and Schroeder-Heister, 1990; Miller, Nadathur, Pfenning, and Scedrov, 1991) or as a forward-chaining inference rule (Negri and von Plato, 1998). One of the immediate applications of LKF is as a formal framework for computing and justifying the addition of such synthetic inference rules to LK.

#### **3 The** LKF **proof system**

The LKF proof system does not deal with formulas but with *polarized formulas*: these are built from atomic formulas and negated atomic formulas (collectively called literals), the *polarized logical connectives*, and the first-order quantifiers ∀ and ∃. The polarized logical connectives come in two flavors: the *positive connectives* are *f*<sup>+</sup>, ∨<sup>+</sup>, *t*<sup>+</sup>, ∧<sup>+</sup>, and ∃, while the *negative connectives* are *t*<sup>−</sup>, ∧<sup>−</sup>, *f*<sup>−</sup>, ∨<sup>−</sup>, and ∀.

Literals are also assigned a polarity as follows. An *atomic bias assignment* is a function δ(·) that maps atomic formulas to the set of two tokens {+, −}: if δ(*A*) is + then that atomic formula is positive, and if δ(*A*) is − then that atomic formula is negative. We extend δ(·) to literals by setting δ(¬*A*) to be the opposite polarity of δ(*A*). We may ask that all atomic formulas are positive, that they are all negative, or we can mix polarity assignments. In particular, the atomic bias assignment δ<sup>+</sup>(·) assigns all atoms the positive polarity while δ<sup>−</sup>(·) assigns all atoms the negative polarity. We shall often suppress explicit reference to atomic bias assignments, assuming that they have been specified and fixed. The only restriction we impose on atomic bias assignments is that they are stable under substitution: that is, for all atomic formulas *A*, every first-order variable *x*, and every term *t*, δ(*A*) = δ([*t*/*x*]*A*). This restriction is equivalent to saying that the value of δ(·) is determined by the predicate that is the top-level symbol of its argument: that is, if *A* and *A*′ are two atoms formed with the same predicate, then δ(*A*) = δ(*A*′).
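The stability restriction can be illustrated with a small sketch. The Python fragment below is our own (the names `make_bias` and `delta` are illustrative, not from the paper): since the polarity is computed from the predicate symbol alone, the assignment is automatically stable under substitution.

```python
# A sketch of an atomic bias assignment δ(·) that is stable under
# substitution: polarity depends only on the atom's predicate symbol.
def make_bias(positive_preds):
    """Return δ mapping an atom ('atom', pred, terms) to '+' or '-'."""
    def delta(atom):
        pred = atom[1]
        return '+' if pred in positive_preds else '-'
    return delta

delta = make_bias({'path'})
# Substituting terms never changes the predicate, hence never changes δ:
assert delta(('atom', 'path', ['x', 'y'])) == delta(('atom', 'path', ['a', 'b'])) == '+'
assert delta(('atom', 'q', ['x'])) == '-'
```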

A polarized formula is *positive* if it is a positive literal or its top-level connective or quantifier is positive (i.e., it is of the form *B* ∧<sup>+</sup> *C*, *B* ∨<sup>+</sup> *C*, ∃*x*.*B*, *t*<sup>+</sup>, or *f*<sup>+</sup>); similarly, a polarized formula is *negative* if it is a negative literal or its top-level connective or quantifier is negative (i.e., it is of the form *B* ∧<sup>−</sup> *C*, *B* ∨<sup>−</sup> *C*, ∀*x*.*B*, *t*<sup>−</sup>, or *f*<sup>−</sup>).

Polarized formulas are in *negation normal form* (nnf), meaning that there are no occurrences of implication ⊃ and that the negation symbol ¬ has only atomic scope. When the negation symbol ¬ is used with the non-atomic polarized formulas of LKF, we shall view it as the following function that transforms that polarized formula to its De Morgan dual.

**Definition 3.1** The negation symbol ¬ is defined as the following function when applied to non-atomic polarized formulas.

1. ¬¬*A* = *A* for atomic formulas *A*
2. ¬(*B* ∧<sup>+</sup> *C*) = ¬*B* ∨<sup>−</sup> ¬*C*,  ¬(*B* ∨<sup>−</sup> *C*) = ¬*B* ∧<sup>+</sup> ¬*C*
3. ¬(*B* ∨<sup>+</sup> *C*) = ¬*B* ∧<sup>−</sup> ¬*C*,  ¬(*B* ∧<sup>−</sup> *C*) = ¬*B* ∨<sup>+</sup> ¬*C*
4. ¬∃*x*.*B* = ∀*x*.¬*B*,  ¬∀*x*.*B* = ∃*x*.¬*B*

It is easily shown that ¬¬*B* = *B* for all polarized formulas *B*. Clearly, negation is treated differently for unpolarized formulas (where it is an abbreviation for "implies false") and polarized formulas (where it computes the De Morgan dual).
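Definition 3.1 can be rendered as a small recursive function. The following Python sketch uses our own tuple encoding (tags such as `'and+'` are illustrative, not the paper's notation) and checks the involution ¬¬*B* = *B* on an example.

```python
# A sketch of De Morgan negation on polarized formulas. Binary nodes are
# (tag, B, C); quantifiers are (tag, x, B); atoms are ('atom', ...);
# negated atoms are ('neg', atom).
DUAL = {'and+': 'or-', 'or-': 'and+', 'or+': 'and-', 'and-': 'or+',
        'exists': 'forall', 'forall': 'exists'}

def neg(f):
    tag = f[0]
    if tag == 'atom':
        return ('neg', f)
    if tag == 'neg':                 # ¬¬A = A for atomic A
        return f[1]
    if tag in ('exists', 'forall'):
        return (DUAL[tag], f[1], neg(f[2]))
    return (DUAL[tag], neg(f[1]), neg(f[2]))

B = ('and+', ('atom', 'p'), ('forall', 'x', ('or-', ('atom', 'q'), ('atom', 'r'))))
assert neg(B) == ('or-', ('neg', ('atom', 'p')),
                  ('exists', 'x', ('and+', ('neg', ('atom', 'q')),
                                           ('neg', ('atom', 'r')))))
assert neg(neg(B)) == B              # ¬¬B = B for all polarized formulas
```

Note how each clause of Definition 3.1 appears directly: the dual table flips ∧<sup>+</sup>/∨<sup>−</sup>, ∨<sup>+</sup>/∧<sup>−</sup>, and ∃/∀.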

The sequent calculus LKF for polarized formulas is presented in Figure 3: this presentation is a simplification of our original presentation given in Liang and Miller (2009). This proof system uses one-sided sequents, but of two varieties, namely ⊢ Γ ⇑ Θ and ⊢ *B* ⇓ Θ, where Γ is a multiset of polarized formulas, Θ is a set of polarized formulas, and *B* is a single polarized formula. The up and down arrows separate sequents into two *zones*: the zone on the right of the arrows (written using Θ) is called the *storage* for that sequent. In notation such as ⊢ Γ, Γ′ ⇑ Θ, Θ′, the multiset Γ, Γ′ represents the multiset sum of Γ and Γ′, while the set Θ, Θ′ represents the union of the two sets Θ and Θ′: it is, of course, possible for Θ and Θ′ to share a non-empty intersection. When moving a collection of polarized formulas from the left of the ⇑ into storage, we coerce multisets into sets in the obvious way. Note that, by inspection, the storage of the sequent in the conclusion of an inference rule is always a subset of the storage of the sequents in the premises. We say that the polarized formula *B* has an LKF proof if the sequent ⊢ *B* ⇑ · has a proof using the inference rules of Figure 3.
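The different treatment of the two zones can be illustrated in a few lines. The following Python sketch (our own illustration, not part of the formal development) uses `Counter` for the multiset zone and ordinary sets for storage.

```python
# The left zone Γ of an LKF sequent is a multiset (sum preserves copies),
# while storage Θ is a set (union merges duplicates).
from collections import Counter

Gamma, Gamma2 = Counter({'A': 1}), Counter({'A': 2, 'B': 1})
Theta, Theta2 = {'A', 'B'}, {'B', 'C'}

assert Gamma + Gamma2 == Counter({'A': 3, 'B': 1})   # multiset sum: Γ, Γ′
assert Theta | Theta2 == {'A', 'B', 'C'}             # set union: Θ, Θ′ (overlap allowed)

# Coercing a multiset into a set when storing formulas:
assert set(Gamma2) == {'A', 'B'}
```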

Before completing the details of the LKF proof system, we informally describe its relationship to Gentzen's LK proof system. Just as polarized formulas are essentially regular formulas with some additional structure added (the + and − annotations), LKF sequents are essentially LK sequents with additional structure. That extra structure is the establishment of two zones within a sequent, namely, the storage zone and the non-storage zone. Additionally, LKF sequents come in two kinds, as is witnessed by the use of either ⇑ or ⇓. Thus, if one wishes to relate LKF sequents to Gentzen's original sequents, one only needs to forget this additional structure. In particular, the

**Fig. 3** The inference rules for LKF. Here, *P* is a positive polarized formula and *p* is a positive literal; *N* is a negative polarized formula and *C* is a positive polarized formula or negative literal. The rule for ∀ has the usual eigenvariable restriction: *y* is not free in any polarized formula in the concluding sequent.

arrows ⇑ and ⇓ can be replaced by a comma and all the polarization annotations on polarized formulas can be deleted.

We borrow the terminology *asynchronous* and *synchronous* rules from Andreoli (1992). A derivation composed only of asynchronous introduction rules (see Figure 3) and the *store* rule will be called an *asynchronous phase*, and a derivation composed only of synchronous introduction rules and the *init* rule will be called a *synchronous phase*. The sequents in an asynchronous phase all involve ⇑-sequents while the sequents in a synchronous phase all involve ⇓-sequents. An LKF proof is composed of alternations of these two kinds of phases. In particular, the *decide* rule connects a synchronous phase above its premise with an asynchronous phase below its conclusion, and the *release* rule connects an asynchronous phase above its premise with a synchronous phase below its conclusion.

The asynchronous phase can be used to encapsulate what is often called *don't care nondeterminism*. That is, if we consider the asynchronous phase as a large-scale inference rule having a sequent of the form ⊢ Γ ⇑ Θ as its conclusion and sequents of the form ⊢ · ⇑ Θ′ as its premises, then that large-scale rule is independent of the sequence of rule applications within the asynchronous phase (see Lemma 3.4). On the other hand, the synchronous phase is a sequence of applications of inference rules with choices (particularly for the ∨<sup>+</sup> and ∃ introduction rules), and different choices will yield different synchronous phases: such phases, therefore, capture *don't know nondeterminism*.

While the weakening and contraction rules are not explicitly given in LKF, both of these rules occur implicitly. The *decide* rule does an implicit contraction on the polarized formula *P*: hence, the only polarized formulas contracted in an LKF proof are positive polarized formulas. The *init* and *t*<sup>+</sup> rules do implicit weakening on the polarized formulas in Θ: thus weakening is available for positive polarized formulas and negative literals. A negative, non-literal polarized formula is therefore never weakened nor contracted: such polarized formulas are treated *linearly*, in the sense of linear logic (Girard, 1987).

Polarized formulas in the storage zone play two diferent roles in proof search. With the *decide* rule, a positive polarized formula in the storage is *simultaneously* contracted and made available to introduction rules. On the other hand, with the *init* rule, a negative literal in storage is available to end the proof. No other kind of polarized formula will occur in storage.

The four binary logical connectives of LKF (∨<sup>+</sup>, ∨<sup>−</sup>, ∧<sup>+</sup>, ∧<sup>−</sup>) can be classified using three different attributes: positive or negative; additive or multiplicative; and conjunctive or disjunctive. Fixing any two of these attributes uniquely determines the third. For example, a connective that is both additive and positive must be the disjunction ∨<sup>+</sup>. Note also that the De Morgan dual of a logical connective (in the sense of Definition 3.1) flips its polarity and its conjunctive/disjunctive status but does not change its additive/multiplicative status. The introduction rule for ∧<sup>+</sup> looks additive since the storage Θ in the conclusion and the premises is the same. The essential multiplicative character of ∧<sup>+</sup> is not apparent in this proof system, in which there can be only one focused polarized formula in a sequent. In Section 10, we present a *multifocused* version of LKF, and in that enlarged setting, it will be clear that ∧<sup>+</sup> is, in fact, a multiplicative connective.
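The three-attribute classification can be checked mechanically. The following Python sketch is our own encoding of the table implicit in this paragraph; the attribute names are illustrative.

```python
# The four binary LKF connectives classified by (polarity,
# additive/multiplicative, conjunctive/disjunctive).
from itertools import combinations

ATTRS = {
    'or+':  ('positive', 'additive',       'disjunctive'),
    'and-': ('negative', 'additive',       'conjunctive'),
    'and+': ('positive', 'multiplicative', 'conjunctive'),
    'or-':  ('negative', 'multiplicative', 'disjunctive'),
}

# Fixing any two attributes determines the third: no two connectives
# agree on any pair of attribute coordinates.
for i, j in combinations(range(3), 2):
    pairs = [(v[i], v[j]) for v in ATTRS.values()]
    assert len(set(pairs)) == 4

# De Morgan duality flips polarity and conjunctive/disjunctive status
# but preserves the additive/multiplicative status.
DUAL = {'and+': 'or-', 'or-': 'and+', 'or+': 'and-', 'and-': 'or+'}
for c, d in DUAL.items():
    assert ATTRS[c][1] == ATTRS[d][1]
    assert ATTRS[c][0] != ATTRS[d][0] and ATTRS[c][2] != ATTRS[d][2]
```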

The proof system for LKF given in Figure 3 has no cut rule; thus the proofs built using the rules in Figure 3 are cut-free proofs. Cut-rules for LKF and a cut-elimination theorem will be presented in the next section.

Let *B* be a polarized formula and let *B̌* be the *depolarized* version of *B*: that is, *B̌* is the unpolarized formula that results from removing the superscripts + and − from the logical connectives in *B*. Since *B* is in negation normal form, the formula *B̌* might have occurrences of negated atomic formulas, say ¬*A*, and these should be seen as abbreviations for *A* ⊃ *f*. Depolarizing a multiset or set of polarized formulas Γ yields the set Γ̌ resulting from depolarizing the formulas in Γ.

**Theorem 3.2 (Soundness of** LKF**)** *Let B be a polarized formula and let* Γ *and* Θ *be a multiset and set, respectively, of polarized formulas. If* ⊢ Γ ⇑ Θ *is provable in* LKF *then* ⊢ Γ̌, Θ̌ *is provable in* LK*. If* ⊢ *B* ⇓ Θ *is provable in* LKF *then* ⊢ *B̌*, Θ̌ *is provable in* LK*.*

*Proof* This theorem can be proved by mutual induction on the structure of (cut-free) LKF proofs. Most cases of this mutual induction are straightforward. For example, the introduction rule for ∨<sup>+</sup> in LKF corresponds to the introduction rule for ∨ in LK, while the introduction rule for ∨<sup>−</sup> in LKF corresponds to the multiplicative version of the introduction rule for ∨ in Figure 2. The *init* rule in LKF corresponds, however, to the following LK derivation.

$$
\frac{\dfrac{\dfrac{}{p \vdash p}\ \textit{init}}{p \vdash p, f, \check{\Theta}}\ \textit{wR}}{\vdash p,\, p \supset f,\, \check{\Theta}}\ \supset\!R
$$

Finally, *decide* in LKF corresponds to the *cR* rule, and *store* and *release* do not contribute to the LK proof. □

The converse of this soundness theorem is more challenging to prove: we shall state and prove such completeness as Theorem 8.4 in Section 8. (Every time we mention completeness theorems in this paper, we shall mean *relative completeness* with respect to another proof system: we will not use the model-theoretic notion of validity in this paper.) In anticipation of that result, we state a version of that completeness theorem here. Let *B* be a first-order polarized formula, let δ(·) be any atomic bias assignment, and let *C* be the unpolarized formula *B̌*. If *C* is provable in LK (in the sense that ⊢ *C* is provable in LK) then *B* is provable in LKF. A consequence of this completeness theorem is the following: if *C* is an unpolarized formula that is provable in LK, then for every polarized formula *B* (and atomic bias assignment) such that *B̌* is *C*, *B* has an LKF proof. Note that if there are *n* occurrences of propositional connectives in *C*, there are 2<sup>*n*</sup> polarized formulas *B* such that *B̌* = *C*. Clearly, polarization does not affect provability, but it can have a large impact on the structure of (focused) proofs.
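The count of 2<sup>*n*</sup> polarizations can be illustrated by enumerating them. The Python sketch below is our own (the tuple encoding and the function `polarizations` are illustrative) and restricts the polarization choices to the binary propositional connectives, each of which independently receives a + or − annotation.

```python
# Enumerate all polarized variants of an unpolarized formula: every binary
# connective ('and' or 'or') may be annotated '+' or '-'.
def polarizations(f):
    tag = f[0]
    if tag == 'atom':
        return [f]
    if tag in ('and', 'or'):
        return [(tag + s, l, r)
                for l in polarizations(f[1])
                for r in polarizations(f[2])
                for s in ('+', '-')]
    if tag in ('forall', 'exists'):
        return [(tag, f[1], b) for b in polarizations(f[2])]
    return [f]

# A formula with n = 2 binary connectives has 2^2 = 4 polarized variants.
F = ('and', ('atom', 'p'), ('or', ('atom', 'q'), ('atom', 'r')))
assert len(polarizations(F)) == 4
assert ('and+', ('atom', 'p'), ('or-', ('atom', 'q'), ('atom', 'r'))) in polarizations(F)
```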

We now state two properties about (cut-free) LKF proofs.

**Lemma 3.3 (Admissibility of Weakening)** *If* ⊢ Γ ⇑ Θ *and* ⊢ *B* ⇓ Θ *are (cut-free) provable and if* Θ′ *is a set of positive polarized formulas and negative literals, then* ⊢ Γ ⇑ Θ, Θ′ *and* ⊢ *B* ⇓ Θ, Θ′ *are also provable.*

This lemma is proved easily by induction on the structure of proofs. The proof further shows that weakening does not affect the structure of proofs, in that the same inference rules are applied at each step.

The following lemma captures the fact that the asynchronous phase of inference rules can deal with don't-care nondeterminism: any polarized formula to the left of the ⇑ can be selected to be processed first.

**Lemma 3.4** *If there is a (cut-free) proof of* ⊢ *A*, Γ ⇑ Θ *then there is a (cut-free) proof that ends with either an introduction of A or a store rule on A.*

*Proof* This lemma holds because the asynchronous introduction rules permute over each other in such a way that the same premises remain. The formal proof of this lemma is by induction on the sum of the sizes of the formulas in Γ. The size of a formula is the number of occurrences of literals, connectives, and quantifiers in the formula. In particular, *A* and ¬*A* are of the same size. In the base case, Γ is empty, and the result is trivial. For the inductive case, let Γ = *B*, Γ′ and assume that the sequent ⊢ *A*, *B*, Γ′ ⇑ Θ is the conclusion of an inference rule that is either an introduction of or a *store* rule on *B*. We then proceed to show that this rule can be permuted above the introduction or *store* of *A*. There are several cases to consider.

*Case: A and B are both either positive formulas or negative literals.* In this case, the last rule is a *store* on *B* with premise ⊢ *A*, Γ′ ⇑ Θ, *B*. By the inductive hypothesis on the smaller Γ′, the next rule above must be a *store* on *A*, with premise ⊢ Γ′ ⇑ Θ, *A*, *B*. But clearly we can switch the order of the two *store* rules:

$$
\frac{\dfrac{\vdash \Gamma' \Uparrow \Theta, A, B}{\vdash B, \Gamma' \Uparrow \Theta, A}\ \textit{store}}{\vdash A, B, \Gamma' \Uparrow \Theta}\ \textit{store}
$$

*Case: A is a positive formula or negative literal and B is a non-literal negative formula.* In this case, we consider the structure of *B*. For example, if *B* is *B*<sub>1</sub> ∨<sup>−</sup> *B*<sub>2</sub>, then the premise of the last rule is ⊢ *A*, *B*<sub>1</sub>, *B*<sub>2</sub>, Γ′ ⇑ Θ. Since the size of *B*<sub>1</sub>, *B*<sub>2</sub>, Γ′ is smaller than the size of *B*<sub>1</sub> ∨<sup>−</sup> *B*<sub>2</sub>, Γ′, the inductive hypothesis provides a proof where the rule above is the *store* rule applied to *A*, with premise ⊢ *B*<sub>1</sub>, *B*<sub>2</sub>, Γ′ ⇑ Θ, *A*. Starting from that sequent, we can switch the *store* and ∨<sup>−</sup> rules, resulting in

$$
\frac{\dfrac{\vdash B_1, B_2, \Gamma' \Uparrow \Theta, A}{\vdash B_1 \vee^{-} B_2, \Gamma' \Uparrow \Theta, A}\ \vee^{-}}{\vdash A, B_1 \vee^{-} B_2, \Gamma' \Uparrow \Theta}\ \textit{store}
$$

The cases where *B* is *t*<sup>−</sup>, *B*<sub>1</sub> ∧<sup>−</sup> *B*<sub>2</sub>, ∀*x*.*B*′, and *f*<sup>−</sup> are similar.

*Case: B is a positive formula or negative literal and A is a non-literal negative formula.* This case is analogous to the previous case. We illustrate with the case that *A* is *A*<sub>1</sub> ∧<sup>−</sup> *A*<sub>2</sub>. Since the last rule is a *store* on *B*, its premise is ⊢ *A*<sub>1</sub> ∧<sup>−</sup> *A*<sub>2</sub>, Γ′ ⇑ Θ, *B*. By the inductive hypothesis, the next rule above is the introduction for ∧<sup>−</sup>:

$$
\frac{\dfrac{\vdash A_1, \Gamma' \Uparrow \Theta, B \qquad \vdash A_2, \Gamma' \Uparrow \Theta, B}{\vdash A_1 \wedge^{-} A_2, \Gamma' \Uparrow \Theta, B}\ \wedge^{-}}{\vdash A_1 \wedge^{-} A_2, B, \Gamma' \Uparrow \Theta}\ \textit{store}.
$$

These rules can be permuted to yield the desired form

$$
\frac{\dfrac{\vdash A_1, \Gamma' \Uparrow \Theta, B}{\vdash A_1, B, \Gamma' \Uparrow \Theta}\ \textit{store} \qquad \dfrac{\vdash A_2, \Gamma' \Uparrow \Theta, B}{\vdash A_2, B, \Gamma' \Uparrow \Theta}\ \textit{store}}{\vdash A_1 \wedge^{-} A_2, B, \Gamma' \Uparrow \Theta}\ \wedge^{-}.
$$

*Case: A and B are both non-literal negative polarized formulas.* There are several cases to consider, but they are all similar. For example, if *A* and *B* are *A*<sub>1</sub> ∨<sup>−</sup> *A*<sub>2</sub> and *B*<sub>1</sub> ∨<sup>−</sup> *B*<sub>2</sub>, respectively, and the last rule introduces *B*, we just need to show that the two ∨<sup>−</sup>-introductions permute over each other, which follows easily from the fact that both proofs can be constructed from the common premise ⊢ *A*<sub>1</sub>, *A*<sub>2</sub>, *B*<sub>1</sub>, *B*<sub>2</sub>, Γ′ ⇑ Θ. In the case where *A* is *A*<sub>1</sub> ∨<sup>−</sup> *A*<sub>2</sub> and *B* is *B*<sub>1</sub> ∧<sup>−</sup> *B*<sub>2</sub>, introducing *B*<sub>1</sub> ∧<sup>−</sup> *B*<sub>2</sub> results in the premises ⊢ *A*<sub>1</sub> ∨<sup>−</sup> *A*<sub>2</sub>, *B*<sub>1</sub>, Γ′ ⇑ Θ and ⊢ *A*<sub>1</sub> ∨<sup>−</sup> *A*<sub>2</sub>, *B*<sub>2</sub>, Γ′ ⇑ Θ, both of which have a smaller inductive measure, which allows us to assume that the next rule above will introduce *A*<sub>1</sub> ∨<sup>−</sup> *A*<sub>2</sub>; we can therefore build the proof

$$
\frac{\vdash B, \Gamma \Uparrow \Theta \qquad \vdash \lnot B, \Gamma' \Uparrow \Theta'}{\vdash \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\ \textit{cut}_u
\qquad
\frac{\vdash B \Downarrow \Theta \qquad \vdash \lnot B, \Gamma' \Uparrow \Theta'}{\vdash \Gamma' \Uparrow \Theta, \Theta'}\ \textit{cut}_f
$$

$$
\frac{\vdash \Gamma \Uparrow \Theta, P \qquad \vdash \lnot P, \Gamma' \Uparrow \Theta'}{\vdash \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\ \textit{dcut}_u
\qquad
\frac{\vdash B \Downarrow \Theta, P \qquad \vdash \lnot P \Uparrow \Theta'}{\vdash B \Downarrow \Theta, \Theta'}\ \textit{dcut}_f
$$

**Fig. 4** The cut rules of LKF. Here, *B* is an arbitrary polarized formula and *P* is a positive polarized formula.

$$
\frac{\dfrac{\vdash A_1, A_2, B_1, \Gamma' \Uparrow \Theta \qquad \vdash A_1, A_2, B_2, \Gamma' \Uparrow \Theta}{\vdash A_1, A_2, B_1 \wedge^{-} B_2, \Gamma' \Uparrow \Theta}\ \wedge^{-}}{\vdash A_1 \vee^{-} A_2, B_1 \wedge^{-} B_2, \Gamma' \Uparrow \Theta}\ \vee^{-}.
$$

The remaining cases are treated similarly. □

**Definition 3.5** We say that a (cut-free) proof of ⊢ *A*, Γ ⇑ Θ is *eager* with respect to *A* if the last inference rule introduces *A* or is a *store* rule on *A*. We say that the proof is *delayed* with respect to *A* if either


In other words, a proof is delayed with respect to *A* if *A* is subject to an introduction or *store* rule only when it appears in a conclusion of the form ⊢ *A* ⇑ Θ. Note also that a proof of ⊢ *A* ⇑ Θ is both eager and delayed with respect to *A*.

Lemma 3.4 implies that a proof can be transformed into either the eager or the delayed form.

#### **4 Cut Elimination for** LKF

Given that LKF has two kinds of sequents and each of these has two zones for holding polarized formulas, we introduce in Figure 4 a total of four cut rules in order to state and prove the cut-elimination theorem for LKF. The *cut<sup>u</sup>* rule (called the *unfocused* cut rule) applies only to ⇑-sequents while the *cut<sup>f</sup>* rule (called the *focused* cut rule) involves one ⇓-sequent. Both of those cut rules also have a "delayed" version in which one of the occurrences of the polarized cut formula is "locked" in storage.

It is important to note that in the delayed cuts, the polarized cut formula P is positive and not a negative literal: in particular, if P were a negative literal in the *dcut<sup>f</sup>* rule and if B = ¬P, then *dcut<sup>f</sup>* would not be admissible, since focusing on a positive literal requires the proof to end in an initial rule.

A simple observation shows that the cut rules in Figure 4 do not suffer from the collision problems mentioned in Section 2.1. As we noted in the previous section, only positive polarized formulas are contracted (by the *decide* rule) in LKF proofs: as a result, exactly one of the pair of polarized formulas A and ¬A involved in a cut rule will be positive, and only one of them can be contracted. Similarly, weakening only appears within the *init* rule in LKF proofs and, as a result, the problematic case involving weakening also disappears.

The general strategy for proving cut elimination in LKF extended with these cut rules is familiar: we reduce cuts to "key cases" in which the polarized cut formula is principal in both premises. The proof proceeds by simultaneous induction over the permutabilities of all four cuts. The inductive measure is the lexicographical ordering consisting of the size of the polarized cut formula followed by the sum of the heights of the subproofs above the cut. We apply the procedure to the topmost cuts first, thus assuming that the cuts to be reduced have cut-free subproofs.
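For reference, the measure just described can be written out explicitly (the notation μ and h below is ours, introduced only for this illustration): to a cut on the polarized formula A with subproofs Π₁ and Π₂ we assign

$$
\mu(\mathit{cut}) \;=\; \bigl\langle\, |A|,\; h(\Pi_1) + h(\Pi_2) \,\bigr\rangle,
$$

where |A| is the size of the polarized cut formula and h(Π) is the height of the proof Π, and pairs are compared lexicographically: a reduction step decreases the measure if it lowers |A|, or keeps |A| fixed while lowering the sum of the heights.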

Lemma 3.4 is used to simplify the cut-elimination proof. However, the application of this lemma for proof transformation may affect the height of proofs (because of the t⁻ rule). These transformations must be applied carefully to preserve the inductive measure. For the cut-elimination proof, we further require that the following conditions be placed on the cut rules.


The third requirement may appear inconsistent with the others when ¬A is negative in *cut<sup>f</sup>*: however, the transition from *cut<sup>u</sup>* or *dcut<sup>u</sup>* to a *cut<sup>f</sup>* only occurs when the cut formula is decomposed into subformulas, which reduces the stronger inductive measure. For the *dcut<sup>f</sup>* rule, the subproof above the negative cut formula ¬P can be considered both eager and delayed with respect to ¬P because it is the only formula to the left of ⇑. By Lemma 3.4, any proof can be transformed into the required forms, so the reducibility of the restricted cuts also implies the reducibility of the unrestricted versions. In other words, before the application of any cut, we can always apply Lemma 3.4 to assume that the subproofs are in the required forms. The cut-elimination arguments will show that all restrictions are preserved when any of the four cut rules are permuted to other cut rules.

We detail the permutations of each of the four cuts. We sometimes do not repeat cases that are obvious, and we generally ignore the quantifiers, as the first-order quantifiers add nothing to the argument: their treatment is completely standard.

#### **4.1 Permutations of** *cut<sup>u</sup>*

The *cut<sup>u</sup>* rule has the general form, repeated here for convenience:

290 Chuck Liang and Dale Miller

$$
\frac{\vdash A, \Gamma \Uparrow \Theta \quad \vdash \neg A, \Gamma' \Uparrow \Theta'}{\vdash \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{cut}_u
$$

Assume without loss of generality that A is positive and, therefore, that ¬A is negative. It is also required that the left subproof above *cut<sup>u</sup>* is *eager* with respect to the positive A, i.e., that it ends in a *store* rule on the cut formula A. Furthermore, the right subproof above the negative cut formula ¬A is required to be *delayed* with respect to ¬A. These assumptions mean that this cut can be transformed immediately into a *dcut<sup>u</sup>*:

$$
\frac{\dfrac{\vdash \Gamma \Uparrow \Theta, A}{\vdash A, \Gamma \Uparrow \Theta}\; \mathit{store} \quad \vdash \neg A, \Gamma' \Uparrow \Theta'}{\vdash \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{cut}_u
\;\longrightarrow\;
\frac{\vdash \Gamma \Uparrow \Theta, A \quad \vdash \neg A, \Gamma' \Uparrow \Theta'}{\vdash \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{dcut}_u
$$

Clearly, the restriction on the *delayed* form of the subproof above the negative cut formula ¬A is preserved for the *dcut<sup>u</sup>* rule. The inductive measure is reduced by the smaller height of the left subproof above the cut.

#### **4.2 Permutations of** *dcut<sup>u</sup>*

The delayed, unfocused *dcut<sup>u</sup>* rule has the form

$$
\frac{\vdash \Gamma \Uparrow \Theta, P \quad \vdash \neg P, \Gamma' \Uparrow \Theta'}{\vdash \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{dcut}_u
$$

where the cut formula P is positive. It is required that the subproof above the right premise is *delayed* with respect to the cut formula ¬P. These cuts are permuted to the point where P is selected for focus, at which point the cut transforms into a combination of *cut<sup>f</sup>* and *dcut<sup>f</sup>*. In other words, the "goal" or "target" of all permutations of *dcut<sup>u</sup>* is to be able to apply the following transformation when the left premise of the *dcut<sup>u</sup>* is the conclusion of the *decide* rule.

$$
\frac{\dfrac{\vdash P \Downarrow \Theta, P}{\vdash \cdot \Uparrow \Theta, P}\; \mathit{decide} \quad \vdash \neg P \Uparrow \Theta'}{\vdash \cdot \Uparrow \Theta, \Theta'}\; \mathit{dcut}_u
\;\longrightarrow\;
\frac{\dfrac{\vdash P \Downarrow \Theta, P \quad \vdash \neg P \Uparrow \Theta'}{\vdash P \Downarrow \Theta, \Theta'}\; \mathit{dcut}_f \quad \vdash \neg P \Uparrow \Theta'}{\vdash \cdot \Uparrow \Theta, \Theta'}\; \mathit{cut}_f
$$

In the transformed proof, the upper *dcut<sup>f</sup>* has subproofs of lesser height measure, while the lower *cut<sup>f</sup>* is a *key case* cut in which the cut formula is principal in both subproofs. That is, cut-free proofs of ⊢ P ⇓ Θ, Θ′ and ⊢ ¬P ⇑ Θ′ must both end with the cut formulas P and ¬P subject to an inference rule. The key-case cuts immediately decompose into cuts on subformulas of a smaller size than P (or reduce completely by weakening in the case of P being a positive literal). Thus, the inductive measures of both cuts are reduced.

Note that the *eager* restriction on the right subproof above *cut<sup>f</sup>* is trivially preserved, since ¬P is the only polarized formula on the left of ⇑.

All other permutations of *dcut<sup>u</sup>* make progress toward this case. We organize these permutations into two stages.

The first stage performs permutations over inference rules in the right subproof of *dcut<sup>u</sup>*. The right subproof above *dcut<sup>u</sup>* ends in ⊢ ¬P, Γ′ ⇑ Θ′. We permute *dcut<sup>u</sup>* upward until it has such a right subproof with an empty Γ′. The fact that this subproof is *delayed* with respect to ¬P means that if it ends in a conclusion ⊢ ¬P, B, Γ′ ⇑ Θ′, we can assume that the last rule either introduces B or is a *store* rule on B (and not on ¬P). There are many subcases depending on the form of B:

*Case: B is a positive polarized formula or negative literal.* In this case, the rule above is a *store* on B, resulting in the following permutation.

$$
\frac{\vdash \Gamma \Uparrow \Theta, P \quad \dfrac{\vdash \neg P, \Gamma' \Uparrow \Theta', B}{\vdash \neg P, B, \Gamma' \Uparrow \Theta'}\; \mathit{store}}{\vdash B, \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{dcut}_u
\;\longrightarrow\;
\frac{\dfrac{\vdash \Gamma \Uparrow \Theta, P \quad \vdash \neg P, \Gamma' \Uparrow \Theta', B}{\vdash \Gamma, \Gamma' \Uparrow \Theta, \Theta', B}\; \mathit{dcut}_u}{\vdash B, \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{store}
$$

The "delayed" restriction on the right subproof above *dcut<sup>u</sup>* is preserved by definition: an immediate subproof of a delayed proof is also delayed. This observation applies similarly to all subsequent cases.

*Case: B is B₁ ∨⁻ B₂.* In this case, we can transform

$$
\frac{\vdash \Gamma \Uparrow \Theta, P \quad \dfrac{\vdash \neg P, B_1, B_2, \Gamma' \Uparrow \Theta'}{\vdash \neg P, B_1 \lor^- B_2, \Gamma' \Uparrow \Theta'}\; \lor^-}{\vdash B_1 \lor^- B_2, \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{dcut}_u
$$

into the following derivation.

$$
\frac{\dfrac{\vdash \Gamma \Uparrow \Theta, P \quad \vdash \neg P, B_1, B_2, \Gamma' \Uparrow \Theta'}{\vdash B_1, B_2, \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{dcut}_u}{\vdash B_1 \lor^- B_2, \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\; \lor^-
$$

*Case: B is B₁ ∧⁻ B₂.* In this case, we can transform

$$
\frac{\vdash \Gamma \Uparrow \Theta, P \quad \dfrac{\vdash \neg P, B_1, \Gamma' \Uparrow \Theta' \quad \vdash \neg P, B_2, \Gamma' \Uparrow \Theta'}{\vdash \neg P, B_1 \land^- B_2, \Gamma' \Uparrow \Theta'}\; \land^-}{\vdash B_1 \land^- B_2, \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{dcut}_u
$$

into the following derivation.

$$
\frac{\dfrac{\vdash \Gamma \Uparrow \Theta, P \quad \vdash \neg P, B_1, \Gamma' \Uparrow \Theta'}{\vdash B_1, \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{dcut}_u \quad \dfrac{\vdash \Gamma \Uparrow \Theta, P \quad \vdash \neg P, B_2, \Gamma' \Uparrow \Theta'}{\vdash B_2, \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{dcut}_u}{\vdash B_1 \land^- B_2, \Gamma, \Gamma' \Uparrow \Theta, \Theta'}\; \land^-
$$

The other cases for B are treated similarly. This stage ends when the right subproof concludes with a sequent of the form ⊢ ¬P ⇑ Θ′.

The second stage performs permutations over inference rules in the left subproof of *dcut<sup>u</sup>*. The cases of asynchronous introduction rules are analogous to the cases demonstrated above and are equally straightforward. Generally speaking, the permutation of cut above introduction rules is always straightforward. The important cases to point out are the *decide*, *release*, and *store* rules. A *store* rule ending the left subproof is also a trivial case because it cannot affect the cut formula. The interesting case is when the left subproof ends in a sequent of the form ⊢ · ⇑ Θ, P. The rule above this sequent must be *decide*. There are two cases, depending on whether or not the polarized formula selected for focus is the cut formula. If it is not the cut formula but, say, another formula Q, then we can permute inference rules as follows.

$$
\frac{\dfrac{\vdash Q \Downarrow Q, \Theta, P}{\vdash \cdot \Uparrow Q, \Theta, P}\; \mathit{decide} \quad \vdash \neg P \Uparrow \Theta'}{\vdash \cdot \Uparrow Q, \Theta, \Theta'}\; \mathit{dcut}_u
\;\longrightarrow\;
\frac{\dfrac{\vdash Q \Downarrow Q, \Theta, P \quad \vdash \neg P \Uparrow \Theta'}{\vdash Q \Downarrow Q, \Theta, \Theta'}\; \mathit{dcut}_f}{\vdash \cdot \Uparrow Q, \Theta, \Theta'}\; \mathit{decide}
$$

If the polarized formula selected for focus is P, then we have reached the targeted transition to key-case cuts already described above.

#### **4.3 Permutations of** *dcut<sup>f</sup>*

The general form of *dcut<sup>f</sup>* is

$$
\frac{\vdash B \Downarrow \Theta, P \quad \vdash \neg P \Uparrow \Theta'}{\vdash B \Downarrow \Theta, \Theta'}\; \mathit{dcut}_f
$$

with P positive. This cut permutes over synchronous introduction rules until reaching an *init* or *release* rule on its left premise, at which point the cut transitions to a *dcut<sup>u</sup>* with shorter subproofs:

$$
\frac{\dfrac{\vdash B \Uparrow \Theta, P}{\vdash B \Downarrow \Theta, P}\; \mathit{release} \quad \vdash \neg P \Uparrow \Theta'}{\vdash B \Downarrow \Theta, \Theta'}\; \mathit{dcut}_f
\;\longrightarrow\;
\frac{\dfrac{\vdash B \Uparrow \Theta, P \quad \vdash \neg P \Uparrow \Theta'}{\vdash B \Uparrow \Theta, \Theta'}\; \mathit{dcut}_u}{\vdash B \Downarrow \Theta, \Theta'}\; \mathit{release}
$$

Besides the cases of initial rules, all other permutations of *dcut<sup>f</sup>* make progress towards this case. Since ¬P is the only polarized formula to the left of ⇑, the "delayed" requirement of *dcut<sup>u</sup>* is trivially met. The right-side subproof with the negative cut formula stays intact during these permutations. We consider two cases where B is a positive polarized formula: the other cases are treated similarly. If B is a positive literal, then ⊢ B ⇓ Θ, P must be the conclusion of an initial rule. Since P is also positive, it must be the case that ¬B ∈ Θ. Thus ⊢ B ⇓ Θ, Θ′ is also the conclusion of an initial rule. If B is B₁ ∨⁺ B₂, then we have the following transformation (here, i is 1 or 2):

$$
\frac{\dfrac{\vdash B_i \Downarrow \Theta, P}{\vdash B_1 \lor^+ B_2 \Downarrow \Theta, P}\; \lor^+ \quad \vdash \neg P \Uparrow \Theta'}{\vdash B_1 \lor^+ B_2 \Downarrow \Theta, \Theta'}\; \mathit{dcut}_f
\;\longrightarrow\;
\frac{\dfrac{\vdash B_i \Downarrow \Theta, P \quad \vdash \neg P \Uparrow \Theta'}{\vdash B_i \Downarrow \Theta, \Theta'}\; \mathit{dcut}_f}{\vdash B_1 \lor^+ B_2 \Downarrow \Theta, \Theta'}\; \lor^+
$$

#### **4.4 Permutations of** *cut<sup>f</sup>*

The *cut<sup>f</sup>* rule has the general form

$$
\frac{\vdash A \Downarrow \Theta \quad \vdash \neg A, \Gamma' \Uparrow \Theta'}{\vdash \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{cut}_f
$$

It is required that the subproof above the unfocused sequent ⊢ ¬A, Γ′ ⇑ Θ′ is *eager* with respect to ¬A.

If A is negative, then the left premise of *cut<sup>f</sup>* must be the conclusion of a *release* rule, and the cut permutes to a *cut<sup>u</sup>* with shorter subproofs:

$$
\frac{\dfrac{\vdash A \Uparrow \Theta}{\vdash A \Downarrow \Theta}\; \mathit{release} \quad \vdash \neg A, \Gamma' \Uparrow \Theta'}{\vdash \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{cut}_f
\;\longrightarrow\;
\frac{\vdash A \Uparrow \Theta \quad \vdash \neg A, \Gamma' \Uparrow \Theta'}{\vdash \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{cut}_u
$$

As for the restrictions on *cut<sup>u</sup>*: ¬A must be positive if A is negative, so the subproof above the positive cut formula ¬A stays *eager* with respect to that polarized formula, and the subproof above ⊢ A ⇑ Θ is trivially *delayed* with respect to the negative cut formula.

If A is positive, then the left premise of *cut<sup>f</sup>* must be the conclusion of either *init* or an introduction of the cut formula A. We illustrate three cases below: the other cases are similar.

1. If A is a positive literal, then the left premise of *cut<sup>f</sup>*, namely ⊢ A ⇓ Θ, is the conclusion of an initial rule with ¬A ∈ Θ. The other, *eager* subproof of ⊢ ¬A, Γ′ ⇑ Θ′ must end in a *store* rule on ¬A, with premise ⊢ Γ′ ⇑ Θ′, ¬A. But since ¬A ∈ Θ, the provability of ⊢ Γ′ ⇑ Θ, Θ′ follows from weakening.

2. If A is A₁ ∨⁺ A₂, then ¬A is ¬A₁ ∧⁻ ¬A₂. This key case requires transforming the derivation

$$
\frac{\dfrac{\vdash A_i \Downarrow \Theta}{\vdash A_1 \lor^+ A_2 \Downarrow \Theta}\; \lor^+ \quad \dfrac{\vdash \neg A_1, \Gamma' \Uparrow \Theta' \quad \vdash \neg A_2, \Gamma' \Uparrow \Theta'}{\vdash \neg A_1 \land^- \neg A_2, \Gamma' \Uparrow \Theta'}\; \land^-}{\vdash \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{cut}_f
$$

into the derivation

$$
\frac{\vdash A_i \Downarrow \Theta \quad \vdash \neg A_i, \Gamma' \Uparrow \Theta'}{\vdash \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{cut}_f
$$

The inductive measure is reduced by the size of the cut formulas. Here we can apply Lemma 3.4 to the subproof above ⊢ ¬Aᵢ, Γ′ ⇑ Θ′ so that it becomes *eager* with respect to (each) ¬Aᵢ without regard to how the transformation might affect the height of proofs, because the lexicographical inductive measure is still reduced. This argument applies similarly to the other key cases.

3. If A is A₁ ∧⁺ A₂, then ¬A is ¬A₁ ∨⁻ ¬A₂ and the proof is transformed as follows:

$$
\frac{\dfrac{\vdash A_1 \Downarrow \Theta \quad \vdash A_2 \Downarrow \Theta}{\vdash A_1 \land^+ A_2 \Downarrow \Theta}\; \land^+ \quad \dfrac{\vdash \neg A_1, \neg A_2, \Gamma' \Uparrow \Theta'}{\vdash \neg A_1 \lor^- \neg A_2, \Gamma' \Uparrow \Theta'}\; \lor^-}{\vdash \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{cut}_f
$$

$$
\longrightarrow\quad
\frac{\vdash A_2 \Downarrow \Theta \quad \dfrac{\vdash A_1 \Downarrow \Theta \quad \vdash \neg A_1, \neg A_2, \Gamma' \Uparrow \Theta'}{\vdash \neg A_2, \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{cut}_f}{\vdash \Gamma' \Uparrow \Theta, \Theta'}\; \mathit{cut}_f
$$

The two cuts introduced are both on smaller cut formulas than the original cut: the inductive hypothesis is first applied to the upper cut to obtain a cut-free proof, and then to the lower one.

With these permutation results in hand, we can now prove the cut-admissibility theorem for LKF.

#### **Theorem 4.1** *The rules cut<sup>u</sup>, cut<sup>f</sup>, dcut<sup>u</sup>, and dcut<sup>f</sup> are admissible in* LKF*.*

*Proof* The formal proof is a nested induction argument: first on the number of cuts in each proof, then on the lexicographical measure for each cut. The corresponding procedure is: select a topmost cut with cut-free subproofs and apply Lemma 3.4 so that the subproofs satisfy the requirements concerning the *eager* and *delayed* properties. Then apply the transformations to reduce the cut. Apply this procedure repeatedly until all cuts are eliminated. □

#### **5 Admissibility of the general** *init* **rule**

The initial rule of LKF requires P to be a literal in order to prove the sequent ⊢ P ⇓ ¬P, Θ. Just as important as the admissibility of cut is the admissibility of the more general form of *init*: that is, the sequent ⊢ B, ¬B ⇑ Θ is provable for every polarized formula B. For an unfocused sequent calculus, the proof of this result is straightforward because of the perfect duality between the introduction rules for dual logical connectives. In particular, assuming that B is negative, apply its (invertible) introduction rule followed by the introduction rule for ¬B (reading rules from conclusion to premises). The induction hypothesis can then be applied directly to the premises. In a focused setting, however, the proof becomes more difficult, since multiple asynchronous or synchronous connectives are introduced in a single phase. To solve this problem, we introduce the following relation, which was also used in Liang and Miller (2011).

**Definition 5.1** Let ↑ be the binary relation between polarized formulas and multisets of polarized formulas defined inductively as follows.

1. B ↑ {B} if B is a positive polarized formula or negative literal.
2. t⁻ ↑ {}.
3. (B ∨⁻ C) ↑ Φ, Φ′ if B ↑ Φ and C ↑ Φ′.
4. (B ∧⁻ C) ↑ Φ if B ↑ Φ or C ↑ Φ.
5. (∀x.B) ↑ Φ if B ↑ Φ.

Clearly, each such Φ contains only positive polarized formulas and negative literals. Note that the polarized formulas f⁻ and B ∨⁻ f⁻ are not ↑-related to any multiset of polarized formulas.
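As a concrete illustration (our example, not from the original text), let a, b, and c be positive atoms and consider the negative formula (a ∧⁻ b) ∨⁻ c. By clause 1, a ↑ {a}, b ↑ {b}, and c ↑ {c}; by clause 4, (a ∧⁻ b) ↑ {a} and (a ∧⁻ b) ↑ {b}; hence, by clause 3,

$$
(a \land^- b) \lor^- c \;\uparrow\; \{a, c\}
\qquad\text{and}\qquad
(a \land^- b) \lor^- c \;\uparrow\; \{b, c\}.
$$

Each multiset records one way the asynchronous phase can decompose the formula: the ∧⁻ offers a choice of branch, while the ∨⁻ keeps both disjuncts.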

The following lemmas establish the properties of the asynchronous and synchronous phases in a form that allows us to derive the admissibility of the general *init* rule.

**Lemma 5.2** *For all polarized formulas* B*, multisets of polarized formulas* Γ*, and sets of polarized formulas* Θ*, if* ⊢ Φ, Γ ⇑ Θ *is provable for all* Φ *such that* B ↑ Φ*, then* ⊢ B, Γ ⇑ Θ *is also provable.*

*Proof* The proof is by induction on the size of B. If a polarized formula B is not ↑-related to any multiset of polarized formulas, then we say that ↑ is *undefined* for B. Note that if ↑ is undefined for B, then the lemma implies that ⊢ B, Γ ⇑ Θ is provable.

1. If B is a positive polarized formula or negative literal, the property is trivial, since only B ↑ {B} holds and Φ contains only B.


4. Let B be the polarized formula C ∧⁻ D. If ↑ is undefined for B, then it is undefined for C and for D, and the inductive hypothesis states that ⊢ C, Γ ⇑ Θ and ⊢ D, Γ ⇑ Θ are provable. Otherwise, if ⊢ Φ, Γ ⇑ Θ is provable for all Φ such that B ↑ Φ, then it is provable for all Φ such that C ↑ Φ or D ↑ Φ. The inductive hypothesis yields the provability of both ⊢ C, Γ ⇑ Θ and ⊢ D, Γ ⇑ Θ. In either case, the ∧⁻ rule yields a proof of ⊢ C ∧⁻ D, Γ ⇑ Θ.

5. Let B be the polarized formula C ∨⁻ D. Assume that ⊢ Φ, Γ ⇑ Θ is provable for all Φ such that (C ∨⁻ D) ↑ Φ. This assumption is equivalent to assuming that ⊢ Φ′, Φ′′, Γ ⇑ Θ is provable for all Φ′ and Φ′′ such that C ↑ Φ′ and D ↑ Φ′′. Now assume that C ↑ Φ′ and D ↑ Φ′′ hold. By the above hypothesis, ⊢ Φ′, Φ′′, Γ ⇑ Θ is provable. By the inductive hypothesis applied to C, we know that ⊢ C, Φ′′, Γ ⇑ Θ is provable, and by the inductive hypothesis applied to D, we know that ⊢ C, D, Γ ⇑ Θ is provable. If ↑ is undefined for either C or D, we reach the same conclusion. The ∨⁻ rule thus yields a proof of ⊢ C ∨⁻ D, Γ ⇑ Θ.

6. Let B be the polarized formula ∀x.C and assume that x is not free in Γ, Θ. If B ↑ Φ then C ↑ Φ. If ↑ is undefined for C, then it is also undefined for B. In either case, the inductive hypothesis states that if ⊢ Φ, Γ ⇑ Θ is provable for all Φ such that C ↑ Φ, then ⊢ C, Γ ⇑ Θ is provable. The property is established by applying the ∀ rule. □

The next lemma connects the synchronous phase with the ↑-relation.

**Lemma 5.3** *For all polarized formulas* B *and multisets of polarized formulas* Φ*, if* B ↑ Φ *then* ⊢ ¬B ⇓ Φ *is provable.*

*Proof* The proof proceeds by induction on the size of B, which is the same as the size of ¬B.


4. If B is C ∧⁻ D, then ¬B is ¬C ∨⁺ ¬D. Assuming that B ↑ Φ, either C ↑ Φ or D ↑ Φ. Assume without loss of generality that C ↑ Φ: by the inductive hypothesis, ⊢ ¬C ⇓ Φ is provable. Thus, ⊢ ¬C ∨⁺ ¬D ⇓ Φ is provable using the ∨⁺ rule.

5. If B is C ∨⁻ D, then ¬B is ¬C ∧⁺ ¬D. Assume that B ↑ Φ, Φ′ such that C ↑ Φ and D ↑ Φ′. By the inductive hypotheses, we know that ⊢ ¬C ⇓ Φ and ⊢ ¬D ⇓ Φ′ are provable. Applying weakening (Lemma 3.3) to both sequents, we get that ⊢ ¬C ⇓ Φ, Φ′ and ⊢ ¬D ⇓ Φ, Φ′ are provable. Thus ⊢ ¬C ∧⁺ ¬D ⇓ Φ, Φ′ is provable using the ∧⁺ rule.

6. If B is ∀x.C, then ¬B is ∃x.¬C. If B ↑ Φ then C ↑ Φ. By the inductive hypothesis we have ⊢ ¬C ⇓ Φ, and by the ∃ rule, we have ⊢ ∃x.¬C ⇓ Φ.

7. If B is a positive polarized formula, then the inductive hypothesis also applies to the proper subformulas of ¬B, which is negative and of the same size as B. Thus, if ¬B ↑ Φ, then the cases above show that ⊢ B ⇓ Φ is provable. By weakening, ⊢ B ⇓ B, Φ is also provable, and we can form the derivation

$$
\frac{\dfrac{\vdash B \Downarrow B, \Phi}{\vdash \cdot \Uparrow B, \Phi}\; \mathit{decide}}{\vdash \Phi \Uparrow B}\; \mathit{store}
$$

where a sequence of *store* rules is applied to the positive polarized formulas and negative literals in Φ. This holds for all Φ such that ¬B ↑ Φ, so by Lemma 5.2, ⊢ ¬B ⇑ B is provable, and by applying the *release* rule, we have a proof of ⊢ ¬B ⇓ B. This establishes the property for positive B, for which only B ↑ {B} holds. □

The following theorem states the admissibility of the general form of the *init* rule.

**Theorem 5.4** ⊢ B, ¬B ⇑ · *is provable for all polarized formulas* B*.*

*Proof* Assume without loss of generality that B is positive. Then B ↑ {B}, and Lemma 5.3 states that ⊢ ¬B ⇓ B is provable. Since ¬B is negative, this sequent must be the conclusion of a *release* rule in a cut-free proof, so ⊢ ¬B ⇑ B is provable. Applying the *store* rule on B to this sequent gives a proof of ⊢ B, ¬B ⇑ ·. □
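As a concrete instance (our illustration, with positive atoms a and b), take B = a ∨⁺ b, so that ¬B = ¬a ∧⁻ ¬b. The sequent ⊢ a ∨⁺ b, ¬a ∧⁻ ¬b ⇑ · is obtained by storing both formulas, splitting on ∧⁻, and then deciding on a ∨⁺ b in each branch; the left branch is

$$
\dfrac{\dfrac{\dfrac{\dfrac{}{\vdash a \Downarrow a \lor^+ b, \neg a}\; \mathit{init}}{\vdash a \lor^+ b \Downarrow a \lor^+ b, \neg a}\; \lor^+}{\vdash \cdot \Uparrow a \lor^+ b, \neg a}\; \mathit{decide}}{\vdash \neg a \Uparrow a \lor^+ b}\; \mathit{store}
$$

and symmetrically for ⊢ ¬b ⇑ a ∨⁺ b. The ∧⁻ rule then yields ⊢ ¬a ∧⁻ ¬b ⇑ a ∨⁺ b, and a final *store* of a ∨⁺ b gives ⊢ a ∨⁺ b, ¬a ∧⁻ ¬b ⇑ ·.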

#### **6 Generalized invertibility**

The invertibility of the negative introduction rules is now easily proved using the admissibility of cut. The following corollary is the converse of Lemma 5.2.

**Corollary 6.1** *If* ⊢ B, Γ ⇑ Θ *is provable and* B ↑ Φ*, then* ⊢ Φ, Γ ⇑ Θ *is provable.*

*Proof* Given the assumption B ↑ Φ, Lemma 5.3 implies that the sequent ⊢ ¬B ⇓ Φ is provable. Using a cut rule, we therefore have the following proof.

$$
\frac{\dfrac{\vdash B, \Gamma \Uparrow \Theta \quad \vdash \neg B \Downarrow \Phi}{\vdash \Gamma \Uparrow \Theta, \Phi}\; \mathit{cut}_f}{\vdash \Phi, \Gamma \Uparrow \Theta}\; \mathit{store}
$$

The final result follows from the admissibility of cut (Theorem 4.1). □

From the generalized invertibility property and Lemma 5.2, we can derive the invertibility of the individual asynchronous introduction rules.

**Lemma 6.2** *The introduction rules for the negative connectives are invertible; i.e., the provability of the conclusion of each rule implies the provability of all of its premises.*

*Proof* First, consider the case for ∨⁻. Assume that ⊢ B ∨⁻ C, Γ ⇑ Θ is provable, and assume that B is ↑-related to exactly the multisets Φ₁, …, Φₙ and that C is ↑-related to exactly Ψ₁, …, Ψₘ, where n, m ≥ 0. By the definition of ↑, we know that (B ∨⁻ C) ↑ Φᵢ, Ψⱼ for each i and j such that 1 ≤ i ≤ n and 1 ≤ j ≤ m. (Note that if either n or m is 0, then this statement is vacuously true.) Corollary 6.1 implies that ⊢ Φᵢ, Ψⱼ, Γ ⇑ Θ is provable. By Lemma 5.2, this means that ⊢ B, C, Γ ⇑ Θ is provable.

To consider the case for ∧⁻, assume that ⊢ B ∧⁻ C, Γ ⇑ Θ is provable and (as above) that B is ↑-related to Φ₁, …, Φₙ and C is ↑-related to Ψ₁, …, Ψₘ, where n, m ≥ 0. Then (B ∧⁻ C) ↑ Φᵢ for each i such that 1 ≤ i ≤ n and (B ∧⁻ C) ↑ Ψⱼ for each j such that 1 ≤ j ≤ m. By Corollary 6.1, this implies that ⊢ Φᵢ, Γ ⇑ Θ is provable for each i such that 1 ≤ i ≤ n and that ⊢ Ψⱼ, Γ ⇑ Θ is provable for each j such that 1 ≤ j ≤ m. By Lemma 5.2, ⊢ B, Γ ⇑ Θ and ⊢ C, Γ ⇑ Θ are provable.

The cases for t⁻ and ∀ are similar and omitted. □

Given Lemmas 5.2 and 5.3, we often use the following *argument schema* to establish the provability of ⊢ B₁, …, Bₙ, Γ ⇑ Θ. If ↑ is undefined for any Bᵢ, then Lemma 5.2 already shows that the sequent is provable. Otherwise, assume that for each i ∈ {1, …, n} there is an mᵢ greater than or equal to 1 such that Bᵢ is ↑-related to exactly Φᵢ¹, …, Φᵢ^{mᵢ}. Show that for each possible selection of Φ₁^{k₁}, …, Φₙ^{kₙ}, the sequent ⊢ Γ ⇑ Θ, Φ₁^{k₁}, …, Φₙ^{kₙ} is provable. Then ⊢ B₁, …, Bₙ, Γ ⇑ Θ is provable by Lemma 5.2 plus enough applications of the *store* rule to move each member of Φᵢ^{kᵢ} to the left side of ⇑. Furthermore, if Γ consists of a single positive polarized formula P (P can also be in Θ with Γ empty) and ⊢ P ⇓ P, Θ, Φ₁^{k₁}, …, Φₙ^{kₙ} is provable, then using the *decide* rule

$$\frac{\vdash P \Downarrow P, \Theta, \Phi_1^{k_1}, \dots, \Phi_n^{k_n}}{\vdash \cdot \Uparrow P, \Theta, \Phi_1^{k_1}, \dots, \Phi_n^{k_n}} \; \mathit{decide}$$

the provability of ⊢ B₁, …, Bₙ, P ⇑ Θ also follows from Lemma 5.2 and the *store* rule. The provability of the focused sequent above *decide* often follows from Lemma 5.3.


## **7 Returning to** LK

In this section, we show how the unfocused LK proof system can be faithfully captured within LKF. We do this in three steps: (1) we translate the two-sided proof system LK into a one-sided system; (2) we show that a more general form of contraction is admissible in LKF; and (3) we prove that the unfocused introduction rules of (the one-sided version of) LK are admissible in LKF. As a consequence, LKF is complete for LK.

Gentzen's original version of LK used the additive versions of conjunction and disjunction, namely ∧⁻ and ∨⁺, while his implication ⊃ was multiplicative. Gentzen himself noted (Gentzen, 1935, Remark 2.4) that LK is 'dual' in the sense that the left and right inference rules are symmetrical except for ⊃. In LKF, the multiplicative connective ∨⁻ can be used to encode B ⊃ C as ¬B ∨⁻ C: hence, ¬(B ⊃ C) is encoded as B ∧⁺ ¬C. As a result, we can remove implications and negated implications by mapping them to these multiplicative connectives.

**Definition 7.1** The LK *-polarization* (·)<sup>±</sup> *of classical formulas* is defined as follows (recall that the negation of polarized formulas is given in Definition 3.1):

1. For any atomic formula B, B± = B and (¬B)± = ¬B.
2. (B ∧ C)± = B± ∧⁻ C±; (B ∨ C)± = B± ∨⁺ C±; *t*± = t⁻; *f*± = f⁺.
3. (B ⊃ C)± = ¬B± ∨⁻ C±.

We also assume that all atomic polarized formulas are polarized positively.
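As a worked instance (our example, with a and b positive atoms), consider the classical tautology (a ⊃ b) ⊃ b. Applying Definition 7.1 together with the De Morgan negation of polarized formulas:

$$
\begin{aligned}
(a \supset b)^{\pm} &= \neg a \lor^- b, \\
((a \supset b) \supset b)^{\pm} &= \neg(\neg a \lor^- b) \lor^- b \;=\; (a \land^+ \neg b) \lor^- b.
\end{aligned}
$$

Note how the outer implication becomes the multiplicative ∨⁻ and the negated implication inside it becomes the multiplicative ∧⁺.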

Figure 5 contains the inference rules for LKi, a sequent calculus intermediate between LK and LKF in the sense that it is a one-sided sequent calculus that contains polarized formulas but is not focused. An LK sequent B₁, …, Bₙ ⊢ C₁, …, Cₘ is represented in this setting as ⊢ ¬B₁±, …, ¬Bₙ±, C₁±, …, Cₘ±. Each inference rule of LK is translated directly into this setting: replace each sequent in the premises and conclusion of the rule with its one-sided, polarized version. Left-introduction rules on B are thus represented as one-sided introduction rules on ¬B±.

**Theorem 7.2** *Let* n, m ≥ 0 *and let* B₁, …, Bₙ, C₁, …, Cₘ *be unpolarized formulas. If the sequent* ⊢ ¬B₁±, …, ¬Bₙ±, C₁±, …, Cₘ± ⇑ · *is provable in* LKF*, then the sequent* B₁, …, Bₙ ⊢ C₁, …, Cₘ *is provable in* LK*.*

*Proof* Note that an LKF proof of ⊢ ¬B₁±, …, ¬Bₙ±, C₁±, …, Cₘ± ⇑ · can easily be translated to an LKi proof of ⊢ ¬B₁±, …, ¬Bₙ±, C₁±, …, Cₘ±. Such an LKi proof can then be converted to a proof of the two-sided sequent B₁, …, Bₙ ⊢ C₁, …, Cₘ in LK. In this latter transformation, when the multiplicative connectives ∨⁻ and ∧⁺ are introduced in the LKi proof, implications are introduced on the right or left in the LK proof. □

We shall now proceed to prove that the rules of LKi are admissible in LKF by presenting new admissible LKF rules derived from the LKi rules. When naming the new admissible LKF rules, we will add parentheses around the name of the LKi rule. For example, the *init* rule of LKi yields the admissible LKF rule

Structural rules and Identity rules

$$\frac{\vdash \Delta, B, B}{\vdash \Delta, B}\ \textit{cR} \qquad \frac{\vdash \Delta}{\vdash \Delta, B}\ \textit{wR} \qquad \frac{}{\vdash B, \neg B}\ \textit{init} \qquad \frac{\vdash B, \Delta \quad \vdash \neg B, \Delta'}{\vdash \Delta, \Delta'}\ \textit{cut}$$

Introduction rules

$$\frac{}{\vdash t^-, \Delta}\ t^- \qquad \frac{\vdash B, \Theta \quad \vdash C, \Theta}{\vdash B \wedge^- C, \Theta}\ \wedge^- \qquad \frac{}{\vdash t^+}\ t^+ \qquad \frac{\vdash B, \Theta \quad \vdash C, \Theta'}{\vdash B \wedge^+ C, \Theta, \Theta'}\ \wedge^+$$

$$\frac{\vdash B_i, \Theta}{\vdash B_1 \vee^+ B_2, \Theta}\ \vee^+ \qquad \frac{\vdash \Theta}{\vdash f^-, \Theta}\ f^- \qquad \frac{\vdash B, C, \Theta}{\vdash B \vee^- C, \Theta}\ \vee^- \qquad \frac{\vdash \Delta, B[y/x]}{\vdash \Delta, \forall x. B}\ \forall \qquad \frac{\vdash \Delta, B[s/x]}{\vdash \Delta, \exists x. B}\ \exists$$

**Fig. 5** The rules for LKi. In the ∀ rule, the variable y is not free in the conclusion. In the ∨+ rule, i ∈ {1, 2}.

$$\frac{}{\vdash B, \neg B, \Gamma \Uparrow \cdot}\ (\textit{init})\,.$$

The admissibility of (*init*) follows immediately from Theorem 5.4. The admissibility of (*wR*), namely,

$$\frac{\vdash \Delta \Uparrow \Theta}{\vdash B, \Delta \Uparrow \Theta, \Theta'}\ (\textit{wR})$$

follows from Lemma 3.3 and a simple induction on the structure of B. We delay the proof of the admissibility of the LKi *cut* rule until Section 9.1. We now proceed to prove the admissibility of contraction and of the introduction rules of LKi.

Unlike LK and LKi, LKF does not include explicit rules for contraction. In LKF, the rule of contraction is only applied to positive polarized formulas and only within the *decide* rule. We now show that contraction for *all* polarized formulas is admissible in LKF.

**Lemma 7.3** *The following rule is admissible in* LKF *for all polarized formulas A.*

$$\frac{\vdash A, A, \Gamma \Uparrow \Theta}{\vdash A, \Gamma \Uparrow \Theta}\ (\textit{cR})\,.$$

*Proof* Assume that ⊢ A, A, Γ ⇑ Θ has an LKF proof. Using Lemma 3.4, we can assume that this proof is eager for the first occurrence of A. If A is a positive polarized formula or a negative literal, then the only rule that can be applied to it is *store*, which means that the sequent ⊢ A, Γ ⇑ A, Θ has an LKF proof. Again, this sequent has a proof eager for A and, thus, must be proved by the *store* rule, which implies that ⊢ Γ ⇑ A, Θ has an LKF proof. By using that sequent as the premise of the *store* rule, we have an LKF proof of ⊢ A, Γ ⇑ Θ.

Consider the cases where A is a non-literal negative polarized formula. The case where A is t− is immediate. The case where A is f− follows using Lemma 6.2 twice. If A is B ∨− C then, using Lemmas 3.4 and 6.2 twice, it is the case that ⊢ B, B, C, C, Γ ⇑ Θ is provable. The result follows by using the inductive assumption twice along with the ∨− rule. If A is B ∧− C then, using Lemmas 3.4 and 6.2 twice, it is the case that both ⊢ B, B, Γ ⇑ Θ and ⊢ C, C, Γ ⇑ Θ are provable. The result follows by using the inductive assumption twice along with the ∧− rule. Finally, the case where A is universally quantified is similar and omitted here. □

From the results in the preceding sections, we can show the admissibility of the unfocused introduction rules (corresponding to the rules of LKi) in LKF.

#### **Theorem 7.4 (Admissibility of unfocused introduction rules)** *All the introduction rules of* LKi *are admissible in* LKF*.*

*Proof* Throughout this proof, we use the admissibility of cut combined with the argument schema outlined at the end of Section 6.

The ∨ + -introduction rule for LKi is admissible in LKF in the form

$$\frac{\vdash B_i, \Gamma \Uparrow \Theta}{\vdash B_1 \vee^+ B_2, \Gamma \Uparrow \Theta}\ (\vee^+)$$

for i ∈ {1, 2}. Admissibility follows from using the admissibility of the *cut<sub>u</sub>* rule in the derivation

$$\frac{\vdash B_i, \Gamma \Uparrow \Theta \qquad \vdash \neg B_i,\ B_1 \vee^+ B_2 \Uparrow \cdot}{\vdash B_1 \vee^+ B_2, \Gamma \Uparrow \Theta}\ \textit{cut}_{\mathsf{u}}\,.$$

To show the provability of the right premise above the cut, we apply the argument schema of Section 6. Let ¬Bᵢ ↑ Φ¹, . . . , ¬Bᵢ ↑ Φⁿ be an exhaustive list of multisets of polarized formulas ↑-related to ¬Bᵢ, for n ≥ 0. If n = 0 then the sequent is provable by Lemma 5.2. Otherwise, ¬Bᵢ is positive. For each Φʲ (j ∈ 1, . . . , n), construct the following subproof

$$\dfrac{\dfrac{\dfrac{\vdash B_i \Downarrow B_1 \vee^+ B_2, \Phi^j}{\vdash B_1 \vee^+ B_2 \Downarrow B_1 \vee^+ B_2, \Phi^j}\ \vee^+}{\vdash \cdot \Uparrow B_1 \vee^+ B_2, \Phi^j}\ \textit{decide}}{\vdash B_1 \vee^+ B_2, \Phi^j \Uparrow \cdot}\ \textit{store}$$

The provability of the top sequent follows from Lemma 5.3, and the provability of ⊢ ¬Bᵢ, B₁ ∨+ B₂ ⇑ · follows from all such subproofs by Lemma 5.2.

The ∧ + -introduction rule for LKi is admissible in LKF in the form

$$
\frac{\vdash A, \Gamma \Uparrow \Theta \quad \vdash B, \Gamma \Uparrow \Theta}{\vdash A \wedge^{+} B, \Gamma \Uparrow \Theta}\,(\wedge^{+})\,.
$$

This rule is also justified using the admissibility of *cut<sub>u</sub>* as follows.

Focusing Gentzen's LK proof system 301

$$\dfrac{\vdash B, \Gamma \Uparrow \Theta \qquad \dfrac{\vdash A, \Gamma \Uparrow \Theta \qquad \vdash \neg A,\ \neg B,\ A \wedge^+ B \Uparrow \cdot}{\vdash \neg B,\ A \wedge^+ B,\ \Gamma \Uparrow \Theta}\ \textit{cut}_{\mathsf{u}}}{\dfrac{\vdash A \wedge^+ B, \Gamma, \Gamma \Uparrow \Theta}{\vdash A \wedge^+ B, \Gamma \Uparrow \Theta}\ (\textit{cR})}\ \textit{cut}_{\mathsf{u}}$$

The provability of the top right sequent uses the argument schema described above: let ¬A ↑ Φ¹_{¬A}, . . . , ¬A ↑ Φⁿ_{¬A} and ¬B ↑ Φ¹_{¬B}, . . . , ¬B ↑ Φᵐ_{¬B} be exhaustive lists of multisets ↑-related to ¬A and ¬B, respectively. If either n or m is 0, then the sequent is already provable. Otherwise, for each pair Φⁱ_{¬A}, Φᵏ_{¬B} construct the subproof

$$\dfrac{\dfrac{\dfrac{\vdash A \Downarrow A \wedge^+ B, \Phi^{i}_{\neg A}, \Phi^{k}_{\neg B} \quad \vdash B \Downarrow A \wedge^+ B, \Phi^{i}_{\neg A}, \Phi^{k}_{\neg B}}{\vdash A \wedge^+ B \Downarrow A \wedge^+ B, \Phi^{i}_{\neg A}, \Phi^{k}_{\neg B}}\ \wedge^+}{\vdash \cdot \Uparrow A \wedge^+ B, \Phi^{i}_{\neg A}, \Phi^{k}_{\neg B}}\ \textit{decide}}{\vdash A \wedge^+ B, \Phi^{i}_{\neg A}, \Phi^{k}_{\neg B} \Uparrow \cdot}\ \textit{store}\,.$$

The provability of the top sequents follows from Lemma 5.3, and from these subproofs the provability of ⊢ ¬A, ¬B, A ∧+ B ⇑ · follows by Lemma 5.2.

To prove the admissibility of the introduction of ∃, we similarly rewrite

$$\frac{\vdash A[s/x], \Gamma \Uparrow \Theta}{\vdash \exists x. A, \Gamma \Uparrow \Theta}\ (\exists) \quad\longrightarrow\quad \frac{\vdash A[s/x], \Gamma \Uparrow \Theta \qquad \vdash \neg A[s/x],\ \exists x. A \Uparrow \cdot}{\vdash \exists x. A, \Gamma \Uparrow \Theta}\ \textit{cut}_{\mathsf{u}}\,.$$

The provability of the right premise again uses the argument schema of Section 6: let ¬A[s/x] ↑ Φ¹, . . . , ¬A[s/x] ↑ Φⁿ be the exhaustive list of multisets that are ↑-related to ¬A[s/x]. If n = 0, then the premise is already provable. Otherwise, for each Φⁱ we have

$$\dfrac{\dfrac{\dfrac{\vdash A[s/x] \Downarrow \exists x. A, \Phi^{i}}{\vdash \exists x. A \Downarrow \exists x. A, \Phi^{i}}\ \exists}{\vdash \cdot \Uparrow \exists x. A, \Phi^{i}}\ \textit{decide}}{\vdash \exists x. A, \Phi^{i} \Uparrow \cdot}\ \textit{store}$$

from which the provability of ⊢ ¬A[s/x], ∃x.A ⇑ · follows.

The LKi introduction rule for t+ yields the following admissible rule, which can be justified by the associated LKF derivation.

$$\frac{}{\vdash t^+, \Gamma \Uparrow \Theta}\ (t^+) \quad\longrightarrow\quad \dfrac{\dfrac{\dfrac{\dfrac{}{\vdash t^+ \Downarrow t^+}\ t^+}{\vdash \cdot \Uparrow t^+}\ \textit{decide}}{\vdash t^+ \Uparrow \cdot}\ \textit{store}}{\vdash t^+, \Gamma \Uparrow \Theta}\ (\textit{wR})$$

The negative introduction rules already apply on the left side of ⇑. Thus, every unfocused inference rule can be emulated on the left side of ⇑, and the completeness of LKF with respect to the intermediate calculus LKi, and hence to the original LK, is thereby established.

**Theorem 7.5 (Weak completeness of** LKF**)** *If the sequent* B₁, . . . , Bₙ ⊢ C₁, . . . , Cₘ *is provable in* LK *then the sequent* ⊢ ¬B₁<sup>±</sup>, . . . , ¬Bₙ<sup>±</sup>, C₁<sup>±</sup>, . . . , Cₘ<sup>±</sup> ⇑ · *is provable in* LKF*.*

We have labeled this theorem as "weak completeness" since it states that if an unpolarized formula is provable in LK, then there is *some* polarization of that formula (namely (·)<sup>±</sup> ) which is provable in LKF. Theorem 8.4 in the next section is a stronger version of the completeness theorem since it states that *every* polarization of an unpolarized theorem is provable in LKF.

#### **8 Choosing the polarization of formulas**

We are now able to prove that every polarization of a formula provable in LK is provable in LKF. Formally, we say that the polarized formula B (together with an atomic bias assignment δ(·)) is a *polarization* of the unpolarized formula C if B̌ is C.

We write B ≡ C to mean that both ⊢ ¬B, C ⇑ · and ⊢ ¬C, B ⇑ · are provable. We first show that the positive and negative versions of each connective are equivalent.

**Lemma 8.1** *For every pair of polarized formulas B and C, it is the case that* B ∨+ C ≡ B ∨− C *and* B ∧+ C ≡ B ∧− C*.*

*Proof* To prove the first equivalence, we need proofs of ⊢ ¬B ∧− ¬C, B ∨− C ⇑ · and ⊢ ¬B ∧+ ¬C, B ∨+ C ⇑ ·. The first of these is straightforward given the admissibility of the general initial rule. The provability of the second sequent is equally simple given the admissibility of the *unfocused* introduction rules shown in Section 7, as demonstrated by the following derivation.

$$\dfrac{\dfrac{\vdash \neg B,\ B \Uparrow \cdot}{\vdash \neg B,\ B \vee^+ C \Uparrow \cdot}\ (\vee^+) \qquad \dfrac{\vdash \neg C,\ C \Uparrow \cdot}{\vdash \neg C,\ B \vee^+ C \Uparrow \cdot}\ (\vee^+)}{\vdash \neg B \wedge^+ \neg C,\ B \vee^+ C \Uparrow \cdot}\ (\wedge^+)$$

Showing B ∧+ C ≡ B ∧− C is similar, and the equivalences between the positive and negative versions of the units are straightforward. □

**Definition 8.2** Let ◦ represent one of the binary connectives ∨−, ∨+, ∧−, or ∧+ and let F be a syntactic variable ranging over arbitrary polarized formulas. Let S range over *subformula contexts*, which are defined inductively by

$$\mathcal{S} \;=\; [\cdot] \;\mid\; \mathcal{S} \circ F \;\mid\; F \circ \mathcal{S} \;\mid\; \exists x. \mathcal{S} \;\mid\; \forall x. \mathcal{S}$$

Here, [·] is a constant denoting the primitive subformula context. The notation S[B] denotes the polarized formula that results from replacing [·] in S with B.
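The notion of plugging a formula into a subformula context can be sketched concretely. In this sketch (the tuple encoding, `HOLE` constant, and `plug` function are illustrative names of ours, not from the text), a context is a formula tree containing one hole leaf, and `plug` performs the replacement written S[B]:

```python
# A subformula context is a polarized formula containing one HOLE leaf.
HOLE = ("hole",)

def plug(s, b):
    """Replace the unique occurrence of HOLE in context s by formula b,
    i.e. the operation written S[B] in Definition 8.2."""
    if s == HOLE:
        return b
    if s[0] in ("or-", "or+", "and-", "and+"):   # the S ∘ F and F ∘ S cases
        return (s[0], plug(s[1], b), plug(s[2], b))
    if s[0] in ("forall", "exists"):             # the ∀x.S and ∃x.S cases
        return (s[0], s[1], plug(s[2], b))
    return s                                     # hole-free subterm: unchanged

# Example: with S = [·] ∨+ q, S[p] is p ∨+ q.
q = ("atom", "q")
print(plug(("or+", HOLE, q), ("atom", "p")))
```

Since the hole occurs exactly once, recursing into both children of a binary connective is harmless: the hole-free child is returned unchanged.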

**Theorem 8.3** *Let* S *be a subformula context. If* A ≡ B *then* S[A] ≡ S[B]*.*

*Proof* We prove the more general property: if ⊢ ¬A, B ⇑ · is provable then, for any subformula context S, ⊢ ¬S[A], S[B] ⇑ · is also provable.

The proof of this property essentially repeats the arguments for eliminating the generalized initial rule. However, instead of replicating Lemmas 5.2 and 5.3, we can take advantage of the admissibility of unfocused rules for the positive connectives.

We proceed by induction on S. In the base case, S = [·] and the property is immediate. If, instead, S = F ∨− S′ then S[B] = F ∨− S′[B] and ¬S[A] = ¬F ∧+ ¬S′[A]: we construct

$$\frac{\vdash \neg F,\ F,\ S'[B] \Uparrow \cdot \quad \vdash \neg S'[A],\ S'[B],\ F \Uparrow \cdot}{\vdash \neg F \wedge^{+} \neg S'[A],\ F,\ S'[B] \Uparrow \cdot}\ (\wedge^{+})$$

The left premise follows from the general initial rule admissibility and the right premise is provable by inductive hypothesis (plus weakening). All the other cases are proved similarly. □

**Theorem 8.4 (Strong completeness of** LKF**)** *Let* B *be an unpolarized formula that is provable in* LK *and let* C *be a polarization of* B*. Then* C *is provable in* LKF*.*

*Proof* Let B be an unpolarized formula that is provable in LK, let C be a polarized version of B, and let δ(·) be any atomic bias assignment. By weak completeness (Theorem 7.5), we know that B<sup>±</sup> is provable in LKF. The only differences between B<sup>±</sup> and C are that the + and − signs on logical connectives might differ and that, by construction, the atoms in B<sup>±</sup> are all given positive bias. Using the equivalences in Lemma 8.1 and Theorem 8.3, we can conclude that C is provable, assuming that all atoms are positively biased.

What remains to be shown is that provability is preserved by imposing the atomic bias assignment δ(·). Translating a proof with a negative atom a into one where a is considered positive is the same as translating a proof with ¬a considered positive to one where ¬a is considered negative, so we only need to show one direction of the translation. Assume that a is considered negative in a proof. A strategy for reconstructing the proof where a is considered positive is to use *delays* together with cut. In particular, we define the polarized formula B<sup>δ</sup> as the result of replacing every occurrence of a in B with a ∨− f− (and therefore every occurrence of ¬a by ¬a ∧+ t+). The strategy is to show that every proof of ⊢ B ⇑ · with a considered negative corresponds to a proof of ⊢ B<sup>δ</sup> ⇑ · with a considered positive. Then by the cut rule

$$\frac{\vdash B^{\delta} \Uparrow \cdot \qquad \vdash \neg B^{\delta},\ B \Uparrow \cdot}{\vdash B \Uparrow \cdot}\ \textit{cut}$$

we derive a proof of B without delays and with a considered positive. We can easily generalize the proof of a single formula to the proof of a sequent since (by invertibility) a multiset {B₁, . . . , Bₙ} is equivalent to B₁ ∨− B₂ ∨− · · · ∨− Bₙ.

The rules that may have a literal as principal formula are *store*, *release*, *decide*, and *init*. We show how each rule is emulated in a proof of B<sup>δ</sup>:

1. Both a and ¬a can be the subject of a *store*, in which case the emulations are as follows.

$$\dfrac{\vdash \Gamma \Uparrow a, \Theta}{\vdash a, \Gamma \Uparrow \Theta}\ \textit{store} \quad\longrightarrow\quad \dfrac{\dfrac{\dfrac{\vdash \Gamma \Uparrow a, \Theta}{\vdash a, \Gamma \Uparrow \Theta}\ \textit{store}}{\vdash a, f^-, \Gamma \Uparrow \Theta}\ f^-}{\vdash a \vee^- f^-, \Gamma \Uparrow \Theta}\ \vee^-$$

$$\dfrac{\vdash \Gamma \Uparrow \neg a, \Theta}{\vdash \neg a, \Gamma \Uparrow \Theta}\ \textit{store} \quad\longrightarrow\quad \dfrac{\vdash \Gamma \Uparrow \neg a \wedge^+ t^+, \Theta}{\vdash \neg a \wedge^+ t^+, \Gamma \Uparrow \Theta}\ \textit{store}$$

Thus, in a proof of B<sup>δ</sup>, a will appear on the right side of ⇑ and ⇓ as a, but ¬a will appear as ¬a ∧+ t+.

2. The *release* rule is applicable when a is considered negative and is still applicable to a ∨− f− when a is considered positive. Since a is a literal, the only rule that can apply above *release* is *store*.

$$\dfrac{\dfrac{\vdash \cdot \Uparrow a, \Theta}{\vdash a \Uparrow \Theta}\ \textit{store}}{\vdash a \Downarrow \Theta}\ \textit{release} \quad\longrightarrow\quad \dfrac{\dfrac{\dfrac{\dfrac{\vdash \cdot \Uparrow a, \Theta}{\vdash a \Uparrow \Theta}\ \textit{store}}{\vdash a, f^- \Uparrow \Theta}\ f^-}{\vdash a \vee^- f^- \Uparrow \Theta}\ \vee^-}{\vdash a \vee^- f^- \Downarrow \Theta}\ \textit{release}$$

3. In the *init* rule, a is negative: it is emulated as indicated.

$$\dfrac{}{\vdash \neg a \Downarrow a, \Theta}\ \textit{init} \quad\longrightarrow\quad \dfrac{\dfrac{\dfrac{\dfrac{\dfrac{}{\vdash a \Downarrow \neg a, a, \Theta}\ \textit{init}}{\vdash \cdot \Uparrow \neg a, a, \Theta}\ \textit{decide}}{\vdash \neg a \Uparrow a, \Theta}\ \textit{store}}{\vdash \neg a \Downarrow a, \Theta}\ \textit{release} \quad \dfrac{}{\vdash t^+ \Downarrow a, \Theta}\ t^+}{\vdash \neg a \wedge^+ t^+ \Downarrow a, \Theta}\ \wedge^+$$

4. Finally, when a is considered negative, the *decide* rule can only be applied to ¬a, and must be preceded from above by an *init*, and so is emulated as follows.

$$\dfrac{\dfrac{}{\vdash \neg a \Downarrow \neg a, a, \Theta}\ \textit{init}}{\vdash \cdot \Uparrow \neg a, a, \Theta}\ \textit{decide} \quad\longrightarrow\quad \dfrac{\vdash \neg a \wedge^+ t^+ \Downarrow \neg a \wedge^+ t^+, a, \Theta}{\vdash \cdot \Uparrow \neg a \wedge^+ t^+, a, \Theta}\ \textit{decide}\,.$$

The proof of the remaining premise is easy to fnd.

Finally, to show that ⊢ ¬B<sup>δ</sup>, B ⇑ · is provable with a considered positive, we induct on the structure of B:

1. If B is a or ¬a, consider the following derivations.

$$\dfrac{\dfrac{\dfrac{\dfrac{\dfrac{}{\vdash a \Downarrow a, \neg a}\ \textit{init}}{\vdash \cdot \Uparrow a, \neg a}\ \textit{decide}}{\vdash a, \neg a \Uparrow \cdot}\ \textit{store}}{\vdash a, f^-, \neg a \Uparrow \cdot}\ f^-}{\vdash a \vee^- f^-, \neg a \Uparrow \cdot}\ \vee^- \qquad \dfrac{\dfrac{\vdash \neg a \wedge^+ t^+ \Downarrow \neg a \wedge^+ t^+, a}{\vdash \cdot \Uparrow \neg a \wedge^+ t^+, a}\ \textit{decide}}{\vdash \neg a \wedge^+ t^+, a \Uparrow \cdot}\ \textit{store}$$

The proof of ⊢ ¬(a ∨− f−), a ⇑ · on the right is preceded from above by the same subproof as in the emulation of *decide* for ¬a ∧+ t+.

2. If B is C ∨− D, we apply the admissible unfocused rules to simplify the proof:

$$\dfrac{\dfrac{\vdash \neg C^{\delta}, C, D \Uparrow \cdot \quad \vdash \neg D^{\delta}, C, D \Uparrow \cdot}{\vdash \neg C^{\delta} \wedge^+ \neg D^{\delta},\ C, D \Uparrow \cdot}\ (\wedge^+)}{\vdash \neg C^{\delta} \wedge^+ \neg D^{\delta},\ C \vee^- D \Uparrow \cdot}\ \vee^-$$

The premises are provable by the inductive hypotheses and by weakening.

3. If B is C ∨+ D:

$$\dfrac{\dfrac{\vdash \neg C^{\delta}, C \Uparrow \cdot}{\vdash \neg C^{\delta},\ C \vee^+ D \Uparrow \cdot}\ (\vee^+) \qquad \dfrac{\vdash \neg D^{\delta}, D \Uparrow \cdot}{\vdash \neg D^{\delta},\ C \vee^+ D \Uparrow \cdot}\ (\vee^+)}{\vdash \neg C^{\delta} \wedge^- \neg D^{\delta},\ C \vee^+ D \Uparrow \cdot}\ \wedge^-$$

The premises are provable by inductive hypotheses.

4. The cases of C ∧+ D and C ∧− D are symmetrical to the above. The cases of ∃ and ∀ are also similar, and the cases where a does not appear in B follow directly from the admissibility of the general initial rule. □

Pimentel, Nigam, and Neto (2016) give a similar analysis of how changing the polarity of atoms within the intuitionistic focused proof system LJF (Liang and Miller, 2009) affects the structure of such proofs.

#### **9 Four applications of** LKF

Part of the motivation for developing the LKF proof system is that its meta-theory should help in proving other proof-theoretic results about first-order classical logic. To support this claim, we present four applications of LKF.

#### **9.1 The admissibility of** *cut* **in** LK

We can prove that the admissibility of cut holds for LK given that we have proved cut admissibility for the more complex proof system LKF. While it is no surprise that this can be done, it is reassuring to see that the result for LK follows so directly from the results for LKF.

#### **Theorem 9.1** *The cut rule for* LK *is admissible in the cut-free fragment of* LK*.*

*Proof* Assume that the sequents Γ ⊢ Δ, B and Γ′, B ⊢ Δ′ have cut-free LK proofs. By the weak completeness of LKF (Theorem 7.5), the sequents ⊢ ¬(Γ)<sup>±</sup>, (Δ)<sup>±</sup>, B<sup>±</sup> ⇑ · and ⊢ ¬(Γ′)<sup>±</sup>, ¬(B<sup>±</sup>), (Δ′)<sup>±</sup> ⇑ · both have (cut-free) LKF proofs. By the admissibility of cut for LKF (Theorem 4.1), we know that ⊢ ¬(Γ)<sup>±</sup>, ¬(Γ′)<sup>±</sup>, (Δ)<sup>±</sup>, (Δ′)<sup>±</sup> ⇑ · has a (cut-free) LKF proof. Finally, by Theorem 7.2, we know that Γ, Γ′ ⊢ Δ, Δ′ has a cut-free LK proof. □

#### **9.2 Synthetic inference rules**

Following up on the suggestion in Section 2.4, we show how to define larger-scale, synthetic inference rules using the LKF proof system.

A sequent of the form ⊢ · ⇑ Θ is called a *border sequent*. The only LKF proof rule that can have a border sequent as a conclusion is the *decide* rule.

**Definition 9.2 (Synthetic inference rule)** A *synthetic inference rule* is an inference rule involving only border sequents. Such rules are of the form

$$\frac{\vdash \cdot \Uparrow \Theta_1 \quad \dots \quad \vdash \cdot \Uparrow \Theta_n}{\vdash \cdot \Uparrow \Theta}$$

which is *justified* by a derivation of the form

$$\begin{array}{c}
\vdash \cdot \Uparrow \Theta_1 \quad \dots \quad \vdash \cdot \Uparrow \Theta_n \\
\Pi \\
\vdash \cdot \Uparrow \Theta
\end{array}$$

Here, n ≥ 0, and the derivation Π contains exactly one occurrence of the *decide* rule, and that occurrence is the last inference rule (having the conclusion ⊢ · ⇑ Θ). If that *decide* rule selects as its focus the polarized formula B ∈ Θ, we say that this derivation is a *synthetic inference rule for* B.

Consider again using the formula (from Section 2.4)

$$\forall x \forall y \forall z. (path(x, y) \supset path(y, z) \supset path(x, z))$$

as an assumption in a given fxed theory. In the one-sided sequent setting of LKF, consider instead the negation of this assumption, namely,

$$\exists x \exists y \exists z. (path(x, y) \wedge^{+} path(y, z) \wedge^{+} \neg path(x, z)).$$

Assuming that this positive polarized formula is a member of Θ, consider the following derivation.


$$\dfrac{\dfrac{\dfrac{\begin{array}{ccc} \Xi_1 & \Xi_2 & \Xi_3 \\ \vdash path(r,s) \Downarrow \Theta & \vdash path(s,t) \Downarrow \Theta & \vdash \neg path(r,t) \Downarrow \Theta \end{array}}{\vdash path(r,s) \wedge^+ path(s,t) \wedge^+ \neg path(r,t) \Downarrow \Theta}\ \wedge^+}{\vdash \exists x \exists y \exists z.(path(x,y) \wedge^+ path(y,z) \wedge^+ \neg path(x,z)) \Downarrow \Theta}\ \exists}{\vdash \cdot \Uparrow \Theta}\ \textit{decide}$$

In order to determine the shape of the proofs Ξ₁, Ξ₂, and Ξ₃, we must declare the polarization given to atoms with the *path* predicate. If all such atoms have a negative polarity assigned to them, then both Ξ₁ and Ξ₂ end with the *release* and *store* rules while the proof Ξ₃ must be trivial (just containing the *init* rule) and *path*(r, t) must be a member of Θ. We can write the synthetic rule justified by the above derivation as

$$\frac{\vdash \cdot \Uparrow path(r, s), \Theta \qquad \vdash \cdot \Uparrow path(s, t), \Theta}{\vdash \cdot \Uparrow path(r, t), \Theta}$$

However, if all *path*-atoms have a positive polarity assigned to them, then Ξ₃ ends with the *release* and *store* rules while the proofs Ξ₁ and Ξ₂ must be trivial and both ¬*path*(r, s) and ¬*path*(s, t) must be members of Θ. We can write the synthetic rule justified by the above derivation as

$$\frac{\vdash \cdot \Uparrow \neg path(r, s), \neg path(s, t), \neg path(r, t), \Theta}{\vdash \cdot \Uparrow \neg path(r, s), \neg path(s, t), \Theta}\,.$$

Note that these synthetic inference rules are the one-sided version of the back-chaining and forward-chaining synthetic inference rules for *path* displayed in Section 2.4.
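The forward-chaining reading of the positive-bias synthetic rule can be sketched operationally: whenever the facts path(r, s) and path(s, t) are both available, the fact path(r, t) may be added, and repeating this saturates the path relation. The following sketch (the function name and the edge data are illustrative, not from the text) computes that saturation:

```python
def saturate(edges):
    """Close a set of path facts under the forward-chaining synthetic rule:
    from path(r, s) and path(s, t), add path(r, t) until a fixed point."""
    facts = set(edges)
    changed = True
    while changed:
        changed = False
        for (r, s) in list(facts):
            for (s2, t) in list(facts):
                if s == s2 and (r, t) not in facts:
                    facts.add((r, t))
                    changed = True
    return facts

print(sorted(saturate({("a", "b"), ("b", "c")})))
# → [('a', 'b'), ('a', 'c'), ('b', 'c')]
```

The back-chaining reading of the negative-bias rule corresponds instead to goal-directed search: to prove path(r, t), it suffices to prove path(r, s) and path(s, t).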

The paper by Marin, Miller, Pimentel, and Volpe (2020) develops the proof theory of synthetic inferences for both classical and intuitionistic logic by using the focused proof systems LKF and LJF. That paper also shows that cut and the general initial rule are both admissible in the LK and LJ proof systems augmented with such synthetic inference rules based on *geometric formulas*.

#### **9.3 Herbrand's theorem**

The completeness of LKF proofs yields a surprisingly simple proof of Herbrand's theorem, particularly the variant of Herbrand's theorem based on formulas with only existential quantifiers in prefix position. A richer connection between a more general form of Herbrand's theorem, based on expansion trees (Miller, 1987), and LKF proofs can be found in Chaudhuri, Hetzl, and Miller (2016).

**Theorem 9.3 (Herbrand's theorem)** *Let* B *be an (unpolarized) quantifier-free formula of first-order classical logic,* n ≥ 1*, and* x₁, . . . , xₙ *be a list of first-order variables containing all the free variables of* B*. The formula* ∃x₁ . . . ∃xₙ.B *is provable in* LK *if and only if there is an* m ≥ 1 *and substitutions* θ₁, . . . , θₘ *for the variables* x₁, . . . , xₙ *such that* Bθ₁ ∨ · · · ∨ Bθₘ *is provable in* LK*.*

*Proof* Let B̂ be a polarized version of B in which all logical connectives and units in B are polarized negatively. (For convenience, we abbreviate ∃x₁ . . . ∃xₙ with ∃x̄.) Since ∃x̄.B is provable in LK, the sequent ⊢ ∃x̄.B̂ ⇑ · must have an LKF proof, say Ξ. Clearly, the last inference rule of Ξ is the *store* rule with premise ⊢ · ⇑ ∃x̄.B̂. Given our choice of polarization, it is easy to show that every border sequent in Ξ is of the form ⊢ · ⇑ ∃x̄.B̂, L, where L is a set of literals. Thus, there are only two different ways that the *decide* rule is applied in Ξ. If the *decide* rule is used with a positive literal, the premise is immediately proved using the *init* rule. Otherwise, the *decide* rule starts the synchronous phase with the choice of ∃x̄.B̂ and the subproof determined by that occurrence of the *decide* rule ends with the following inference rules.

$$\dfrac{\dfrac{\dfrac{\vdash \hat{B}\theta \Uparrow \exists \bar{x}.\hat{B}, \mathcal{L}}{\vdash \hat{B}\theta \Downarrow \exists \bar{x}.\hat{B}, \mathcal{L}}\ \textit{release}}{\vdash \exists \bar{x}.\hat{B} \Downarrow \exists \bar{x}.\hat{B}, \mathcal{L}}\ \exists}{\vdash \cdot \Uparrow \exists \bar{x}.\hat{B}, \mathcal{L}}\ \textit{decide}$$

That is, every non-trivial synchronous phase encodes a substitution. Let m ≥ 1 be the number of such non-trivial synchronous phases and let θ₁, . . . , θₘ be the substitutions that those phases encode.

Now let C be the polarized formula equal to B̂θ₁ ∨+ · · · ∨+ B̂θₘ, and consider building an LKF proof of ⊢ C ⇑ ·. In order to ensure that C is polarized positively when m = 1, we take C to be B̂θ₁ ∨+ f+ (essentially encoding a unary version of the binary ∨+). It is now a simple matter to convert the proof Ξ of ⊢ ∃x̄.B̂ ⇑ · into a proof of ⊢ B̂θ₁ ∨+ · · · ∨+ B̂θₘ ⇑ · by copying the asynchronous phases directly and by replacing all the non-trivial synchronous phases in Ξ as follows.

$$\dfrac{\dfrac{\vdash \hat{B}\theta_i \Uparrow \exists \bar{x}.\hat{B}, \mathcal{L}}{\vdash \hat{B}\theta_i \Downarrow \exists \bar{x}.\hat{B}, \mathcal{L}}\ \textit{release}}{\vdash \exists \bar{x}.\hat{B} \Downarrow \exists \bar{x}.\hat{B}, \mathcal{L}}\ \exists \quad\Longrightarrow\quad \dfrac{\dfrac{\vdash \hat{B}\theta_i \Uparrow C, \mathcal{L}}{\vdash \hat{B}\theta_i \Downarrow C, \mathcal{L}}\ \textit{release}}{\vdash \hat{B}\theta_1 \vee^+ \cdots \vee^+ \hat{B}\theta_m \Downarrow C, \mathcal{L}}\ \vee^+$$

In this way, the phase-by-phase structure of Ξ can be used to build an LKF proof of ⊢ B̂θ₁ ∨+ · · · ∨+ B̂θₘ ⇑ ·. □
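A small worked instance of Theorem 9.3 may help (the example is ours, not from the text): ∃x.(p(x) ⊃ p(f(x))) is classically provable, and with θ₁ = [c/x] and θ₂ = [f(c)/x] the disjunction Bθ₁ ∨ Bθ₂ is a propositional tautology over the atoms p(c), p(f(c)), p(f(f(c))), while the single instance Bθ₁ alone is not. A truth-table check confirms this:

```python
from itertools import product

def imp(a, b):
    """Classical implication on booleans."""
    return (not a) or b

def is_tautology(formula, n_atoms):
    """Check truth of `formula` under all assignments to its atoms."""
    return all(formula(*v) for v in product([False, True], repeat=n_atoms))

# Atoms: pc = p(c), pfc = p(f(c)), pffc = p(f(f(c)))
one_instance = lambda pc, pfc, pffc: imp(pc, pfc)                       # B[c/x]
two_instances = lambda pc, pfc, pffc: imp(pc, pfc) or imp(pfc, pffc)    # B[c/x] ∨ B[f(c)/x]

print(is_tautology(one_instance, 3))   # → False (falsified by pc=True, pfc=False)
print(is_tautology(two_instances, 3))  # → True
```

This illustrates why m may need to be greater than 1: no single substitution instance of p(x) ⊃ p(f(x)) is provable on its own.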

#### **9.4 Hosting other focused proof systems**

Proof systems with focusing-like behaviors can sometimes be hosted inside LKF. Such hosting is usually done by translating unpolarized classical logic formulas into polarized formulas in which *delays* have been inserted. These delays are written as ∂−(·) and ∂+(·) and are such that ∂−(B) and ∂+(B) are both logically equivalent to the polarized formula B, while ∂−(B) is negative and ∂+(B) is positive. The expression ∂−(B) can be defined to be either f− ∨− B, t− ∧− B, or ∀x.B (where x is not free in B). Similarly, the expression ∂+(B) can be defined to be either f+ ∨+ B, t+ ∧+ B, or ∃x.B (where x is not free in B).

The LKQ and LKT proof systems of Danos, Joinet, and Schellinx (1995) can be seen


**Fig. 6** Two diferent ways to translate classical logic formulas into polarized formulas.

as LKF proofs in which the following polarization functions are used. Figure 6 defines the left and right translations of unpolarized formulas containing only implications and atoms to polarized formulas. In that figure, a ranges over atomic formulas. It is the case that (cut-free) proofs in LKT of an unpolarized formula using only implications correspond to LKF proofs of its translation under the LKT definition, and (cut-free) proofs in LKQ of such a formula correspond to LKF proofs of its translation under the LKQ definition. LKT focuses only on the left and LKQ only on the right of two-sided sequents. These systems are also examples of "less aggressive" focused systems that designate a *"stoup"* formula: the stoup formula is subject to fewer restrictions than the formula under focus in LKF. The delays emulate the one-sided focusing character of these systems as well as adapt the stoup to a strongly focused system.

#### **10 Other variations for focusing in classical logic**

There have been several variations on focusing systems studied in the literature. In fact, the general phenomenon of focusing for classical logic can be seen as arising from Girard's use of linear logic exponentials to encode classical logic (Girard, 1987) and from Andreoli's discovery of polarity (Andreoli, 1992).

The LKF proof system we have given here can be called a *strongly focused* system: the *decide* rule can only be invoked after *every* negative non-atomic polarized formula has been removed from the sequent. If we do not insist that all negative polarized formulas have been removed in this way, the resulting variant is called a *weakly focused* proof system, following Laurent (2004) and Simmons and Pfenning (2011). Girard's LC proof system is an early example of a weakly focused proof system for classical logic (Girard, 1991). A variant on strong focusing is a system where one chooses a predetermined *suspension criterion* and then allows explicitly suspended negative polarized formulas to remain in the conclusion of the (suitably modified) *decide* rule: suspensions of this kind have proved useful in settings where the logic contains fixed point expressions (Gérard and Miller, 2017).

Let LKFm be the proof system that results from replacing the inference rules of LKF with the extended versions of the synchronous introduction rules and the *release* and *decide* rules given in Figure 7. If the ‡ proviso on the *decide* rule requires that the multiset Δ contains exactly one positive polarized formula, then LKFm is the same as

Synchronous introduction rules

$$\frac{\vdash B\_{1},\Theta\_{1} \Downarrow \Gamma \quad \vdash B\_{2},\Theta\_{2} \Downarrow \Gamma}{\vdash B\_{1} \land^{+} B\_{2},\Theta\_{1},\Theta\_{2} \Downarrow \Gamma} \qquad \frac{\vdash B\_{i},\Theta \Downarrow \Gamma}{\vdash B\_{1} \lor^{+} B\_{2},\Theta \Downarrow \Gamma} \qquad \frac{\vdash [s/x]B,\Theta \Downarrow \Gamma}{\vdash \exists x.B,\Theta \Downarrow \Gamma}$$

Release and decide rules

$$\frac{\vdash \Delta \Uparrow \Gamma}{\vdash \Delta \Downarrow \Gamma}\ \mathit{release}^{\dagger} \qquad\qquad \frac{\vdash \Delta \Downarrow \bar{\Delta}, \Gamma}{\vdash \cdot \Uparrow \bar{\Delta}, \Gamma}\ \mathit{decide}^{\ddagger}$$

The † proviso requires that Δ consists of only negative polarized formulas. In the *decide* rule, Δ is a non-empty multiset of positive polarized formulas and Δ¯ is its underlying set of polarized formulas. The ‡ proviso is discussed in the text.

**Fig. 7** Variations in some of the LKF inference rules.

LKF. It is for this reason that we say that LKF is *single focused*: in such proofs, the zone to the left of the ⇓ always contains exactly one polarized formula (the focus of that sequent). If the ‡ proviso restricts Δ to be just a non-empty set of positive polarized formulas, then the resulting proof system is *multifocused*, and that proof system contains more proofs than the single-focused system. Multifocused proofs were first considered in Delande and Miller (2008) and Delande, Miller, and Saurin (2010) (in the context of linear logic), and the notion of *maximal multifocused* proofs has been used to describe canonical proof systems in linear logic (Chaudhuri, Miller, and Saurin, 2008) and classical logic (Chaudhuri, Hetzl, and Miller, 2016) and to relate sequent calculus proofs to natural deduction proofs (Pimentel, Nigam, and Neto, 2016).

Note that the version of the ∧<sup>+</sup> introduction rule in LKFm is not necessarily invertible, while the version of that introduction rule in LKF is invertible: it appears that the true status of ∧<sup>+</sup> introduction as belonging to the synchronous phase only becomes apparent in the multifocused setting. Note also that the completeness of LKFm follows immediately from the completeness of LKF.

Two simple changes to the LKF proof system yield a focused proof system for *multiplicative additive linear logic* MALL (Girard, 1987). First, the set of formulas to the right of the double arrows must be changed to multisets. Second, the following four inference rules must replace the corresponding inference rules in LKF (Figure 3).

$$\frac{}{\vdash P \Downarrow \neg P}\ \mathit{init}\ (P \text{ atomic}) \qquad \frac{\vdash B \Downarrow \Gamma}{\vdash \cdot \Uparrow B, \Gamma}\ \mathit{decide} \qquad \frac{}{\vdash t^{+} \Downarrow \cdot}\ t^{+} \qquad \frac{\vdash B\_{1} \Downarrow \Theta\_{1} \quad \vdash B\_{2} \Downarrow \Theta\_{2}}{\vdash B\_{1} \land^{+} B\_{2} \Downarrow \Theta\_{1}, \Theta\_{2}}\ \land^{+}$$

Here, the *init* and t<sup>+</sup> rules do not do an implicit weakening, the *decide* rule does not do an implicit contraction, and the side formulas of ∧<sup>+</sup> are treated multiplicatively. The resulting proof system, called MALLF in Liang and Miller (2011), is a focused proof system for MALL. Of course, the usual presentation of MALL results from replacing the logical connectives t<sup>−</sup>, t<sup>+</sup>, f<sup>−</sup>, f<sup>+</sup>, ∧<sup>−</sup>, ∧<sup>+</sup>, ∨<sup>−</sup>, and ∨<sup>+</sup> with ⊤, **1**, ⊥, **0**, &, ⊗, ⅋, and ⊕, respectively. The fact that this proof system is sound and complete for MALL immediately follows from the results about focusing in full linear logic given by Andreoli (1992).

Another variation on focused proof systems uses a list, not a multiset, of formulas to the left of the ⇑: that is, the order by which the asynchronous inference rules are attempted is prescribed in a fixed fashion. This variation was used by Andreoli (1992) in his first focused proof system for linear logic.

The LKF proof system was designed to support automated proof checking and proof search (Chihani, Miller, and Renaud, 2017) as well as to provide new means for proving meta-theoretic results for first-order classical logic (see Section 9). Other researchers, concentrating on the Curry-Howard correspondence (proofs-as-programs) perspective, have designed other variants of focusing for classical logic. In particular, see the LC proof system (Girard, 1991), the LK-based systems of Danos, Joinet, and Schellinx (1995; 1997), and the proof system used to define the λ̄μμ̃-calculus (Curien and Herbelin, 2000).

#### **11 Conclusion**

We have presented the proof system LKF and have proved that it is sound and complete for LK and that the cut rule and the initial rule are admissible. The proofs of these theorems were all done directly using permutation arguments. We have illustrated the utility of LKF by applying it to some standard topics that arise in the proof theory of classical logic. We hope that, although the metatheory of LKF was established here by tedious permutation arguments, many other properties of proofs in classical logic can be proved by applying LKF directly, without the need for such arguments.

**Acknowledgements** We thank Beniamino Accattoli, Marianna Girlando, and the anonymous reviewers for their comments on an earlier version of this paper.

#### **References**


Open Access This chapter is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. If you remix, transform, or build upon this chapter or a part thereof, you must distribute your contributions under the same license as the original.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Intensional Harmony as Isomorphism**

Paolo Pistone and Luca Tranchini

**Abstract** In the present paper we discuss a recent suggestion of Schroeder-Heister concerning the possibility of defining an intensional notion of harmony using isomorphism in second-order propositional logic. The latter is not an absolute notion; rather, its definition is relative to the choice of criteria for identity of proofs. In the paper, it is argued that in order to attain a satisfactory account of harmony, one has to consider a notion of identity stronger than the usual one (based on β- and η-conversions) that the authors have investigated in recent work.

**Key words:** identity of proofs, equivalence of derivations, second-order logic, parametricity, System F, quantification

#### **1 Introduction**

The inferentialist thesis that the meaning of a logical constant is determined by its inference rules has been famously challenged by Prior (1960), who put forward the following pair of introduction and elimination rules for the binary connective tonk:

$$\frac{A}{A \text{ tonk } B} \text{ tonkI} \qquad \qquad \frac{A \text{ tonk } B}{B} \text{ tonkE}$$
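To make the trivialization concrete, here is a small sketch (our illustration, not part of Prior's argument): closing a single premise under tonkI and tonkE by forward chaining reaches any formula whatsoever.

```python
# Forward-chaining closure of {premise} under Prior's tonk rules.
# Formulas are atoms (strings) or pairs ('tonk', A, B); for illustration,
# tonkI is applied to atomic formulas only, which already suffices.

def derivable(premise, goal, atoms):
    derived = {premise}
    changed = True
    while changed:
        changed = False
        for a in [f for f in derived if f in atoms]:
            for b in atoms:                          # tonkI: from A infer A tonk B
                if ('tonk', a, b) not in derived:
                    derived.add(('tonk', a, b))
                    changed = True
        for f in list(derived):                      # tonkE: from A tonk B infer B
            if isinstance(f, tuple) and f[0] == 'tonk' and f[2] not in derived:
                derived.add(f[2])
                changed = True
    return goal in derived

# p ⊢ p tonk q ⊢ q: any atom follows from any other.
assert derivable('p', 'q', atoms=['p', 'q'])
```

The closure terminates because tonkI is restricted to atoms; even so, every atom becomes derivable from any single premise.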

The strong intuition that tonk is semantically deficient has been taken to require a qualification of the inferentialist thesis: Not any arbitrary collection of inference rules

Paolo Pistone
Department of Computer Science and Engineering, University of Bologna, Italy, e-mail: paolo.pistone@uniroma3.it

Luca Tranchini
Department of Computer Science, University of Tübingen, Germany, e-mail: luca.tranchini@gmail.com

© The Author(s) 2024

T. Piecha and K. F. Wehmeier (eds.), *Peter Schroeder-Heister on Proof-Theoretic Semantics*, Outstanding Contributions to Logic 29, https://doi.org/10.1007/978-3-031-50981-0\_10

can determine the meaning of a logical constant, but only those satisfying a certain requirement, which (following Dummett, 1991) is commonly referred to as *harmony*.

The exact significance of harmony is open to interpretation. In particular, there is no agreement as to whether harmony should be regarded as a descriptive or a normative criterion; nor as to whether harmony should be regarded as a criterion of "meaningfulness" or merely of "logicality" (i.e., whether expressions governed by rules which are not in harmony should be regarded as meaningless, or as meaningful but not belonging to the logical vocabulary), or possibly something else. Moreover, different characterizations of the notion of harmony have been proposed in the literature, some of them fully formal, some of them less precise.

**Harmony via second-order translations.** In the context of natural deduction (Gentzen, 1935; Prawitz, 1965a), harmony is usually described by reference to the "perfect balance" between the introduction and elimination rules of the connectives of the natural deduction system NI for intuitionistic propositional logic (see Table 1).


**Table 1** The natural deduction system NI.

But what does this "perfect balance" consist in, exactly? In spite of some attempts to answer the question in a precise way (see, e.g., Belnap, 1962; Tennant, 1978; Read, 2010), a first fully formal definition of harmony has been proposed only recently by Peter Schroeder-Heister (2014a; 2014b).

In a nutshell, Schroeder-Heister's proposal is that of characterizing collections of introduction rules and collections of elimination rules by formulas of quantified propositional intuitionistic logic NI<sup>2</sup>, the extension of NI with universal quantification over propositions, governed by the following rules:

$$\frac{A}{\forall X.A}\ \forall\mathrm{I} \qquad\qquad \frac{\forall X.A}{A[B/X]}\ \forall\mathrm{E}$$

(where ∀I is subject to the usual proviso that X does not occur free in any assumption on which A depends)

Schroeder-Heister's proposal is that two collections of introduction and elimination rules for a connective are in harmony if and only if their characteristic formulas are *interderivable* in NI<sup>2</sup>.

**Towards intensional harmony.** Although this proposal represents a long-awaited step forward in the understanding of harmony, there are reasons for dissatisfaction with it. These hinge upon the fact that interderivable collections of rules are characterized by interderivable formulas. For instance, given ∧E1, the rule ∧E2 is obviously interderivable with the following rule:

$$\frac{A \land B \quad A}{B} \land \text{E}'\_2$$

and the formulas A ∧ B and A ∧ (A ⊃ B) — characterizing the collections of elimination rules for ∧ consisting in the pairs ∧E1, ∧E2 and ∧E1, ∧E′2 respectively — are obviously interderivable. Thus both pairs of rules qualify as in harmony with ∧I on Schroeder-Heister's criterion. However, one has a clear intuition that the pair ∧E1, ∧E′2 is "less" in harmony with ∧I than the pair ∧E1, ∧E2.
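Under the Curry-Howard reading (pairs for proofs of conjunctions, functions for proofs of implications), the interderivability of A ∧ B and A ∧ (A ⊃ B) can be sketched as follows; the helper names are ours, for illustration only.

```python
# Interderivability of A∧B and A∧(A⊃B), Curry-Howard style:
# a proof of a conjunction is a pair, a proof of an implication a function.

def to_variant(p):
    """From a proof of A∧B, build a proof of A∧(A⊃B)."""
    a, b = p
    return (a, lambda _a: b)   # the implication simply ignores its argument

def from_variant(q):
    """From a proof of A∧(A⊃B), build a proof of A∧B."""
    a, f = q
    return (a, f(a))           # apply the implication to the first component

proof = ("proof of A", "proof of B")
assert from_variant(to_variant(proof)) == proof
# The other round trip rebuilds the function component from scratch, so the
# two formulas are interderivable without the maps being mutually inverse.
```

This is exactly the situation the intensional account wants to detect: derivations both ways exist, but they do not compose to the identity.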

This has prompted the second author (see Tranchini, 2016a) to regard Schroeder-Heister's criterion as a characterization of a "weak" notion of harmony, and to call for a strengthening of it capable of capturing a notion of harmony on which only the pair ∧E1, ∧E2 (and not the pair ∧E1, ∧E′2) qualifies as in harmony with ∧I. What distinguishes such a would-be stronger notion of harmony — not just from Schroeder-Heister's weak harmony, but also from other proposals, such as those of Belnap (1962) and of Tennant (1978) — is its (hyper-)*intensional* nature, that is, its being capable of discriminating among collections of rules which are indistinguishable in terms of derivability.

**Intensional inferentialism.** The idea that a semantic framework should be able to draw (hyper-)intensional distinctions has a long tradition, going back at least to Carnap (1956), who notably regarded logical equivalence as too coarse a criterion for synonymy, and proposed instead to characterize synonymy using the notion of intensional isomorphism.

In the context of inferential accounts of meaning, especially in the proof-theoretic semantics tradition of Dummett and Prawitz, (hyper-)intensional aspects have been largely ignored. One of the reasons for this is that these theories of meaning have been shaped in analogy with traditional ones, by replacing the notion of truth conditions with that of assertibility conditions. Consider for instance a language containing two distinct binary connectives ♯ and ♭ whose inferential behaviour is described as follows: a proof of A ♯ B is a triple consisting of a proof of A, a method to transform proofs of A into proofs of B, and a method to transform proofs of B into proofs of A; a proof of A ♭ B differs from a proof of A ♯ B in that its first member is a proof of B. Clearly, A ♯ B and A ♭ B are assertible under the same conditions: Whenever one has a proof of A ♯ B, one knows that a proof of A ♭ B could be constructed, and vice versa. However, to be in possession of a proof of A ♯ B is clearly a different epistemic state from being in possession of a proof of A ♭ B. If we take meaning to consist not only of what Frege called *Bedeutung* (the portion of reality referred to by an expression) but

also of *Sinn* (the epistemic content of an expression), then an inferentialist account of meaning should be able to distinguish between the meanings of ♯ and ♭.
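Read through the BHK lens, the two connectives can be sketched as follows; this is a minimal illustration assuming the triple reading just given, with names of our own choosing.

```python
# A proof of A ♯ B is a triple (proof of A, method A→B, method B→A);
# a proof of A ♭ B has a proof of B as its first component instead.

def sharp_to_flat(p):
    a, f, g = p            # (proof of A, A→B, B→A)
    return (f(a), f, g)    # (proof of B, A→B, B→A)

def flat_to_sharp(q):
    b, f, g = q
    return (g(b), f, g)

# The two connectives are assertible under exactly the same conditions ...
sharp = (3, lambda a: a + 1, lambda b: b - 1)
flat = sharp_to_flat(sharp)
assert flat[0] == 4
# ... yet possessing `sharp` and possessing `flat` are different epistemic
# states: the first component must be recomputed on each conversion.
assert flat_to_sharp(flat)[0] == 3
```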

As a matter of fact, proof theory offers a wide range of formal tools to analyze such issues, and in the present paper we will discuss the prospects of using some of these tools to deliver an intensional account of the notion of harmony.

**Harmony via isomorphism.** An obvious way of attaining a notion of harmony stronger than Schroeder-Heister's would be to require the characteristic formula of the collection of introduction rules to be *the same as* that of the collection of eliminations. But this would be too much. It is true that on such a strengthening the pair ∧E1, ∧E′2 would not count as in harmony with ∧I. However, neither would ∨E count as in harmony with the pair ∨I1, ∨I2: The characteristic formula (see below for a precise definition) of the former is ∀X.(((A ⊃ X) ∧ (B ⊃ X)) ⊃ X) while that of the latter is A ∨ B, i.e., two distinct, though interderivable, NI<sup>2</sup>-formulas. In other words, by adopting this notion of harmony (we dub it *strict harmony*) one would be led to deny that the rules of disjunction are harmonious.

The notion of formula isomorphism coming from categorial proof theory and the study of typed lambda calculi provides a middle ground between interderivability and identity. Inspired by the work of Došen (see, e.g., Došen, 2003), the second author (see again Tranchini, 2016a) used the notion of isomorphism to clarify the exact sense in which merely weakly harmonious rules are harmful (see also Section 3 below), thereby pointing to the relevance of isomorphism for a characterization of harmony.

Different options for defining harmony using isomorphism have been tentatively put forward by Schroeder-Heister (2016). Among the different options proposed, there is that of defining strong harmony by replacing interderivability (resp. identity) in the definition of weak (resp. strict) harmony with isomorphism. However, this proposal is discarded as inappropriate, due to the fact that — at least *prima facie* — ∀X.(((A ⊃ X) ∧ (B ⊃ X)) ⊃ X) and A ∨ B are not isomorphic in NI<sup>2</sup>.

**Main contribution.** The isomorphism of two formulas in a given system is not an absolute notion; it is relative to the choice of a notion of identity of proofs (that is, of an equational theory on the derivations of the system). Building on well-established results in the categorial semantics of second-order logic, in recent work the authors have introduced an equational theory stronger than the usual one using a class of equations referred to as -equations (see Tranchini, Pistone, and Petrolo, 2019; Pistone, Tranchini, and Petrolo, 2021; Pistone and Tranchini, 2021). The class of isomorphisms relative to the induced equivalence is rich enough to overcome the problem mentioned above (in particular, ∀X.(((A ⊃ X) ∧ (B ⊃ X)) ⊃ X) and A ∨ B are isomorphic formulas in NI<sup>2</sup> relative to it). Moreover, although the equational theory induced by these equations (together with the standard conversions for NI<sup>2</sup>) is not maximal on the whole of NI<sup>2</sup>, it is the maximum equational theory of certain weak fragments of NI<sup>2</sup>. Among these fragments there is the one whose formulas correspond to the "encodings" of collections of introduction and elimination rules for propositional connectives. In this paper, we present in an informal way the notion of identity of proofs captured by these equations and show how the results obtained about them

provide a firm footing for an intensional account of harmony lying between weak and strict harmony.

#### **2 From reductions and expansions to isomorphism**

The "perfect balance" between introduction and elimination rules that the notion of harmony aims at capturing has been described as obtaining when

what can be inferred from a logically complex sentence by means of the *elimination rules* for its main connective is *no more* and *no less* than what has to be established in order to infer that very logically complex sentence using the *introduction rules* for its main connective.

When the rules for a connective are in harmony, two kinds of deductive patterns can be exhibited.

Patterns of the first kind are those giving rise to *maximal formula occurrences*, that is, formula occurrences which are the major premise of an application of an elimination rule (i.e., the premise whose main connective is the one to be eliminated) and the conclusion of an application of one of the introduction rules. Prawitz (1965b) defined certain operations on derivations called *reductions*. Reductions allow rewriting a derivation into another one, thereby getting rid of a single maximal formula occurrence (though new ones may be generated in the process): In the case of conjunction, we have the following two reductions:

$$\dfrac{\dfrac{\overset{\mathcal{D}\_1}{A} \quad \overset{\mathcal{D}\_2}{B}}{A \land B}\,\land\mathrm{I}}{A}\,\land\mathrm{E}\_1 \quad\text{reduces to}\quad \overset{\mathcal{D}\_1}{A} \qquad\qquad \dfrac{\dfrac{\overset{\mathcal{D}\_1}{A} \quad \overset{\mathcal{D}\_2}{B}}{A \land B}\,\land\mathrm{I}}{B}\,\land\mathrm{E}\_2 \quad\text{reduces to}\quad \overset{\mathcal{D}\_2}{B}$$

Prawitz showed how — by successively applying reductions in a certain order — any given NI-derivation can be transformed into one in *normal* form, that is, one with no maximal formula occurrences.

The other kind of patterns are those in which the premises of applications of introduction rules have been obtained by applying the corresponding elimination rules. Prawitz (1971) defined operations that are, in a sense, dual to reductions, called immediate expansions. In the case of conjunction, the expansion looks as follows:

$$\overset{\mathcal{D}}{A \land B} \quad\text{expands to}\quad \dfrac{\dfrac{\overset{\mathcal{D}}{A \land B}}{A}\,\land\mathrm{E}\_1 \quad \dfrac{\overset{\mathcal{D}}{A \land B}}{B}\,\land\mathrm{E}\_2}{A \land B}\,\land\mathrm{I}$$

Prawitz showed that by successively applying expansions it is possible to transform any given normal derivation in NI into one in *long normal* form, i.e., into a derivation in which all minimal formula occurrences (those that are the conclusion of an elimination and the premise of an introduction rule) are atomic.

The reduction and expansion associated with the rules of implication are the following:

$$\dfrac{\dfrac{\begin{matrix}[A]^{u}\\ \mathcal{D}\\ B\end{matrix}}{A \supset B}\,\supset\mathrm{I}^{(u)} \quad \overset{\mathcal{D}'}{A}}{B}\,\supset\mathrm{E} \quad\text{reduces to}\quad \begin{matrix}\mathcal{D}'\\ A\\ \mathcal{D}\\ B\end{matrix} \qquad\qquad \overset{\mathcal{D}}{A \supset B} \quad\text{expands to}\quad \dfrac{\dfrac{\overset{\mathcal{D}}{A \supset B} \quad [A]^{u}}{B}\,\supset\mathrm{E}}{A \supset B}\,\supset\mathrm{I}^{(u)}$$

Via the Curry-Howard correspondence between derivations in the implicational fragment of NI and terms of the simply typed λ-calculus, these rewriting operations on derivations correspond (respectively) to β-reduction and η-expansion on λ-terms:

$$(\lambda x.t)\,s \;\overset{\beta}{\hookrightarrow}\; t\{s/x\} \qquad\qquad t \;\overset{\eta}{\hookrightarrow}\; \lambda x.(t\,x)$$
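These rewrite steps can be sketched on a first-order representation of λ-terms; this is a minimal illustration (our encoding), in which capture-avoiding substitution is elided by keeping bound names distinct.

```python
# Terms: ('var', x), ('lam', x, body), ('app', f, a).

def subst(t, x, s):
    """Substitute s for x in t (bound names assumed globally distinct)."""
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'lam':
        return ('lam', t[1], subst(t[2], x, s))
    return ('app', subst(t[1], x, s), subst(t[2], x, s))

def beta(t):
    """One β-step at the root: (λx.t) s ↪ t{s/x}; otherwise return t."""
    if t[0] == 'app' and t[1][0] == 'lam':
        return subst(t[1][2], t[1][1], t[2])
    return t

def eta_expand(t, fresh):
    """η-expansion: t ↪ λx.(t x), with x fresh for t."""
    return ('lam', fresh, ('app', t, ('var', fresh)))

identity = ('lam', 'x', ('var', 'x'))
assert beta(('app', identity, ('var', 'y'))) == ('var', 'y')
assert eta_expand(('var', 'f'), 'z') == ('lam', 'z', ('app', ('var', 'f'), ('var', 'z')))
```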

As in the λ-calculus, reductions and expansions can be used to define an equivalence relation on natural deduction derivations. Two derivations 𝒟 and 𝒟′ are equivalent if and only if one can be obtained from the other by applying a finite number of times (β-)reduction, (η-)expansion and their inverse operations to 𝒟 and 𝒟′ or their sub-derivations (we write 𝒟 ≡ 𝒟′ when 𝒟 and 𝒟′ are βη-equivalent; more generally, given an equivalence relation on derivations, we use ≡ for equivalence with respect to it).1 As equivalent λ-terms can be seen as different ways of representing the same function, Prawitz (1971) observed — following a suggestion by Martin-Löf — that equivalent derivations can be seen as different linguistic representations of the same proof (where proofs are understood as abstract entities informally characterized by the so-called BHK-clauses; see Tranchini 2012; 2016b; 2019).

Given an equivalence relation on derivations, it is possible to use it to define an equivalence relation on formulas that is, in general, stricter than interderivability and that is commonly referred to as *isomorphism*. Let ≡ be an equivalence relation on the derivations of a natural deduction system. Two formulas A and B are isomorphic relative to ≡ (notation A ≃ B) if


there are derivations 𝒟1 of B from the assumption A and 𝒟2 of A from the assumption B such that composing them, in either order, yields a derivation equivalent to the assumption alone:

$$\begin{matrix}[A]\\ \mathcal{D}\_1\\ B\\ \mathcal{D}\_2\\ A\end{matrix} \;\equiv\; A \qquad\qquad \begin{matrix}[B]\\ \mathcal{D}\_2\\ A\\ \mathcal{D}\_1\\ B\end{matrix} \;\equiv\; B$$

<sup>1</sup> We will always implicitly identify derivations up to renaming of discharge indexes, which corresponds to α-equivalence on λ-terms.

The derivation consisting of just the assumption of a formula A can be viewed as representing the identity function on the set of proofs of A. Hence, the second condition of the definition of isomorphism can be expressed by saying that the two derivations 𝒟1 and 𝒟2 represent two functions, from proofs of A to proofs of B and vice versa, which are the inverse of each other. This in turn means that the sets of proofs of A and of B are in bijection.

Typical examples of βη-isomorphic formulas in NI are pairs of formulas of the form (A ∧ B) ∧ C and A ∧ (B ∧ C), or (A ∧ B) ⊃ C and A ⊃ (B ⊃ C), whereas typical examples of interderivable but non-βη-isomorphic formulas are pairs of formulas of the form A and A ∧ A, or A ∧ B and A ∧ (A ⊃ B).
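The second isomorphic pair is just currying: the two maps below are mutually inverse, not merely interderivable. This is a standard illustration, not taken from the text.

```python
# The isomorphism (A∧B)⊃C ≃ A⊃(B⊃C) as a pair of mutually inverse maps.

def curry(f):
    """(A∧B)⊃C  ↦  A⊃(B⊃C)"""
    return lambda a: lambda b: f((a, b))

def uncurry(g):
    """A⊃(B⊃C)  ↦  (A∧B)⊃C"""
    return lambda p: g(p[0])(p[1])

add_pair = lambda p: p[0] + p[1]
assert curry(add_pair)(2)(3) == 5
assert uncurry(curry(add_pair))((2, 3)) == 5   # the round trip acts as the identity
```

Contrast this with the A ∧ B versus A ∧ (A ⊃ B) pair, where derivations both ways exist but no such mutually inverse pair does.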

The notion of isomorphism has been proposed (notably by Došen, 2003) as a formal counterpart of the informal notion of synonymy, i.e., identity of meaning. Intuitively, interderivability is only a necessary, but not sufficient, condition for synonymy. From an inferential perspective, isomorphic formulas can be regarded as synonymous in the sense that:

"they behave exactly in the same manner in proofs: by composing, we can always extend proofs involving one of them, either as assumption or as conclusion, to proofs involving the other, so that nothing is lost, nor gained. There is always a way back. By composing further with the inverses, we return to the original proofs." (Došen, 2003, p. 498)

Clearly, a necessary condition for some notion of isomorphism not to collapse into that of interderivability is that the equivalence relation used in the definition is non-trivial (i.e., there must be at least one formula A and two derivations of A belonging to distinct equivalence classes). In particular, if any two derivations 𝒟1 and 𝒟2 of a formula A from A itself were equivalent, the second condition of the definition of isomorphism would be vacuously satisfied.

The notion of βη-equivalence (and consequently that of βη-isomorphism) plays a distinguished role in NI, since βη-equivalence is the maximum non-trivial equivalence relation definable on NI-derivations. As Došen (2003) and Widebäck (2001) argued, the maximality of an equivalence relation on the derivations of a system can be taken as supporting the claim that it is the *correct* way of analyzing the notion of identity of proofs underlying that system.

For the {⊃, ∧, ⊤}-fragment of NI, βη-equivalence and βη-isomorphism are well understood: the decidability of βη-equivalence is an immediate consequence of the normalization and confluence of βη-reduction in the {⊃, ∧}-fragment, and its maximality was established by Statman (1983), Došen and Petrić (2001), and Widebäck (2001). Moreover, βη-isomorphism in this fragment is decidable and has been fully axiomatized by Solov'ev (1983).
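A decision procedure for isomorphism in this fragment can be sketched via canonical forms in the spirit of Solov'ev's axioms; the encoding below is ours, for illustration: every formula reduces to a multiset of "prime" formulas A1 ⊃ . . . ⊃ An ⊃ X, with both products and premise lists treated as multisets (here, sorted tuples).

```python
# Deciding isomorphism in the {⊃, ∧, ⊤}-fragment by canonical forms.
# Formulas: ('atom', x), ('and', a, b), ('imp', a, b), ('top',).

def canon(t):
    """Return the canonical form: a sorted tuple of primes (premises, head)."""
    tag = t[0]
    if tag == 'top':
        return ()                        # ⊤ is the empty product
    if tag == 'atom':
        return (((), t[1]),)
    if tag == 'and':                     # products merge (∧ commutes/associates)
        return tuple(sorted(canon(t[1]) + canon(t[2])))
    # tag == 'imp': A⊃(B∧C) ≃ (A⊃B)∧(A⊃C) and (A∧B)⊃C ≃ A⊃(B⊃C)
    prem = canon(t[1])
    return tuple(sorted((tuple(sorted(prem + args)), head)
                        for args, head in canon(t[2])))

def iso(a, b):
    return canon(a) == canon(b)

A, B, C = ('atom', 'a'), ('atom', 'b'), ('atom', 'c')
assert iso(('imp', ('and', A, B), C), ('imp', A, ('imp', B, C)))   # currying
assert iso(('and', A, B), ('and', B, A))                           # commutativity
assert not iso(A, ('and', A, A))                                   # A vs A∧A
assert not iso(('and', A, B), ('and', A, ('imp', A, B)))           # interderivable only
```

The last two checks show the procedure separating exactly the interderivable-but-non-isomorphic pairs mentioned above.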

The extension of these results to richer language fragments has proven a difficult task. In the presence of disjunction, the decidability and maximality of the equivalence have been established only recently by Scherer (2017); the decidability of the corresponding isomorphism was established by Ilik (2014). In this case the difficulties were due to the form of the η-expansion for disjunction:


$$\begin{matrix}\mathcal{D}'\\ A \lor B\\ \mathcal{D}''\\ C\end{matrix} \quad\text{expands to}\quad \dfrac{\overset{\mathcal{D}'}{A \lor B} \quad \begin{matrix}\dfrac{[A]^{n}}{A \lor B}\,\lor\mathrm{I}\_1\\ \mathcal{D}''\\ C\end{matrix} \quad \begin{matrix}\dfrac{[B]^{m}}{A \lor B}\,\lor\mathrm{I}\_2\\ \mathcal{D}''\\ C\end{matrix}}{C}\,\lor\mathrm{E}^{(n,m)}$$

(with n and m fresh for 𝒟′)

which can be seen as the composition of the simpler form of expansion proposed by Prawitz

$$\overset{\mathcal{D}}{A \lor B} \quad\text{expands to}\quad \dfrac{\overset{\mathcal{D}}{A \lor B} \quad \dfrac{[A]^{n}}{A \lor B}\,\lor\mathrm{I}\_1 \quad \dfrac{[B]^{m}}{A \lor B}\,\lor\mathrm{I}\_2}{A \lor B}\,\lor\mathrm{E}^{(n,m)}$$

(with n and m *fresh*<sup>2</sup> for 𝒟)

and a generalization of the permutative conversions used in establishing the subformula property of normal derivations in NI:

$$\begin{matrix}\dfrac{\overset{\mathcal{D}}{A \lor B} \quad \begin{matrix}[A]^{n}\\ \mathcal{D}\_1\\ C\end{matrix} \quad \begin{matrix}[B]^{m}\\ \mathcal{D}\_2\\ C\end{matrix}}{C}\,\lor\mathrm{E}^{(n,m)}\\ \mathcal{D}\_3\\ D\end{matrix} \quad\leadsto\quad \dfrac{\overset{\mathcal{D}}{A \lor B} \quad \begin{matrix}[A]^{n}\\ \mathcal{D}\_1\\ C\\ \mathcal{D}\_3\\ D\end{matrix} \quad \begin{matrix}[B]^{m}\\ \mathcal{D}\_2\\ C\\ \mathcal{D}\_3\\ D\end{matrix}}{D}\,\lor\mathrm{E}^{(n,m)}$$

(see, for a discussion, Tranchini 2016a; 2018).3

Whereas the maximality and decidability of βη-equality also hold in the presence of ⊥ (hence for the full language of NI), the decidability of βη-isomorphism in the presence of ⊥ is still an open problem.

#### **3 Weak harmony and its limits**

In order to define harmony formally, a useful preliminary move is that of identifying rules — which are usually taken to be meta-linguistic schemata — with expressions belonging to an object language of the appropriate kind. Slightly reformulating

<sup>3</sup> In this work, the first author proposed the existence of reductions and expansions of this more general form as an informal notion of harmony. Although informal, the requirement is enough to rule out "quantum disjunction" (i.e., the connective whose rules are almost the same as those of disjunction in NI, the only difference being that the elimination rule can be applied only when its minor premises depend on no assumptions other than those discharged by the rule application; see Dummett, 1991) as disharmonious, since the restriction on the elimination rule blocks the possibility of defining an expansion of the more general form. See also the concluding remarks in Section 5.

insights of Schroeder-Heister (2014a; 2014b), we propose to identify the *rules* of an arbitrary natural deduction system with a particular class of *formulas* of the second-order language L extending that of the system NI with universal quantification over propositions and with denumerably many variables †<sup>n</sup> for connectives of each arity n.


More precisely, given denumerably many propositional variables (to be indicated with X, Y, possibly with subscripts) and denumerably many connective variables as above, we define the set of formulas of L, to be indicated with A, B, C, possibly with subscripts, by the following grammar (we use A<sup>n</sup> for a sequence of n comma-separated formulas):<sup>4</sup>

$$A ::= X \mid \top \mid \bot \mid A \land A \mid A \lor A \mid A \supset A \mid \dagger^n(A^n) \mid \forall X.A$$

(we will indicate with L<sup>2</sup> the fragment of L lacking variables for connectives, and with L<sup>2⊃</sup> the fragment of L<sup>2</sup> lacking all connectives apart from ⊃).

The idea of identifying a rule with a formula may appear odd at first, but it is actually very natural. For instance, the rules ∨I1, ∨I2 and ∨E of NI can be identified with the following L-formulas:


$$(\lor \mathbf{i}\_1) \qquad \forall XY.\, X \supset \dagger(X, Y) \qquad\qquad (\lor \mathbf{i}\_2) \qquad \forall XY.\, Y \supset \dagger(X, Y)$$

$$(\lor \mathbf{e}) \qquad \qquad \forall XYZ. (\dagger (Y, Z) \land (Y \supset X) \land (Z \supset X)) \supset X$$

Observe that the standard propositional connectives and the universal quantifier are used to "encode" the different "structural features" implicit in natural deduction rules (i.e., conjunction "encodes" multiplicity of premises, implication "encodes" the passage from premises to conclusions, and universal quantification "encodes" the schematic nature of rules). As we are using L as a meta-language to investigate the notion of rule, disregarding the fact that rules can be associated with a specific piece of vocabulary, we replaced the disjunction of the rules of NI with the (binary) connective variable †. In this way, the conjunction of the three formulas (∨i1), (∨i2) and (∨e) above is an L-formula with a free connective variable that we can take to express the predicate "being a disjunction". Similarly, we can identify the rule ⊃E with the formula ∀XY.(†(X, Y) ∧ X) ⊃ Y and ∧I with the formula ∀XY.(X ∧ Y) ⊃ †(X, Y).

More generally, we call *structural formulas* (to be indicated with $\mathcal{S}$, $\mathcal{S}_1$, . . .) those formulas constructed using only propositional variables and connective variables.

<sup>4</sup> In other words, we are actually working in the fragment of a third-order extension of the language of the system F of Girard (1986), in which variables for connectives occur only free. Thus the natural deduction system over L consists of the rules of the second-order natural deduction system NI<sup>2</sup>, the extension of Girard's System F with primitive rules for ⊤, ⊥, ∧ and ∨.

More precisely, the set of structural formulas is the subset L<sub>S</sub> of L defined as follows (we use $\overrightarrow{\mathcal{S}}$ for a list of comma-separated structural formulas):

$$\mathcal{S} ::= X \mid \dagger^n(\mathcal{S}^n)$$

and we call *rule formulas* (or simply rules, to be indicated with $\mathcal{R}$, $\mathcal{R}_1$, . . .) the L-formulas constructed according to the following grammar (we indicate the set of rule formulas as L<sub>R</sub>):

$$\mathcal{R} ::= \mathcal{S} \mid \forall \overrightarrow{X} \left( \bigwedge\_{i=1}^{n} \mathcal{R}\_{i} \supset \mathcal{S} \right),$$

where $\forall \overrightarrow{X} = \forall X_1 \ldots \forall X_n$ if $n > 0$ and is empty otherwise, and $\bigwedge_{i=1}^{n} \mathcal{R}_i = \mathcal{R}_1 \land (\mathcal{R}_2 \land (\ldots \land \mathcal{R}_n) \ldots)$ if $n > 0$, while $\bigwedge_{i=1}^{0} \mathcal{R}_i = \top$. The *level of a rule* $\mathcal{R}$, indicated with $\ell(\mathcal{R})$, is the maximum number of nested implications in $\mathcal{R}$, so that $\ell(\mathcal{S}) = 0$ and $\ell(\forall \overrightarrow{X}(\bigwedge_{i=1}^{n} \mathcal{R}_i \supset \mathcal{S})) = \max_i(\ell(\mathcal{R}_i)) + 1$.
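Being defined by structural recursion, the level of a rule can be computed mechanically. In the following sketch (ours, for illustration only) a rule is represented either as a plain string, standing for a structural formula, or as a pair of a list of premise rules and a conclusion, standing for ∀X⃗(⋀ᵢ Rᵢ ⊃ S):

```python
# A rule (in the sense of the grammar of rule formulas above) is represented
# either as a plain string, standing for a structural formula S, or as a pair
# (premises, conclusion), standing for ∀X⃗(⋀ᵢ Rᵢ ⊃ S). The representation
# and the function name are ours, for illustration only.

def level(rule) -> int:
    """ℓ(S) = 0;  ℓ(∀X⃗(⋀ᵢ Rᵢ ⊃ S)) = maxᵢ ℓ(Rᵢ) + 1 (0 for no premises)."""
    if isinstance(rule, str):                 # structural formula
        return 0
    premises, _conclusion = rule
    return max((level(r) for r in premises), default=0) + 1

# ⊃E = ∀XY.((†(X,Y) ∧ X) ⊃ Y): both premises structural, so level 1.
imp_e = (["†(X,Y)", "X"], "Y")

# ∨E = ∀XYZ.((†(Y,Z) ∧ (Y ⊃ X) ∧ (Z ⊃ X)) ⊃ X): premises of levels 0, 1, 1,
# so level 2.
or_e = (["†(Y,Z)", (["Y"], "X"), (["Z"], "X")], "X")
```

As expected, ⊃E comes out at level 1 and ∨E at level 2.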

By an *introduction rule* for an $n$-ary connective † we understand a rule of the form

$$(\text{INTRO}) \qquad \qquad \forall \overrightarrow{X} \left( \bigwedge_{i=1}^{n} \mathcal{R}_i \supset \dagger (\overrightarrow{X}) \right)$$

satisfying the following two conditions:


If $\mathcal{R}$ is an introduction rule for † of the above form, we define the *content of* $\mathcal{R}$ (notation $\mathcal{C}(\mathcal{R})$) to be the L<sup>2</sup>-formula $\bigwedge_{i=1}^{n} \mathcal{R}_i$. If ℐ† = $\langle \mathcal{R}_1, \ldots, \mathcal{R}_m \rangle$ is a list of introduction rules for †, we define the *content of* ℐ† (notation $\mathcal{C}(\mathcal{I}_\dagger)$) to be the L<sup>2</sup>-formula $\bigvee_{j=1}^{m} \mathcal{C}(\mathcal{R}_j)$, where $\bigvee_{j=1}^{m} A_j = A_1 \lor (A_2 \lor (\ldots \lor A_m) \ldots)$ if $m > 0$ and $\bigvee_{j=1}^{0} A_j = \bot$.

By an *elimination rule* for an $n$-ary connective † we understand a rule of the form

$$(\text{ELIM}) \qquad \qquad \forall X \forall \overrightarrow{Y} \forall \overrightarrow{X} \left( \left( \dagger (\overrightarrow{X}) \land \bigwedge_{i=1}^{n} \mathcal{R}_i \right) \supset X \right)$$

satisfying the following three conditions:


If $\mathcal{R}$ is an elimination rule for † of the above form, its *content* $\mathcal{C}(\mathcal{R})$ is the L<sup>2</sup>-formula $\forall X \forall \overrightarrow{Y}. \bigwedge_{i=1}^{n} \mathcal{R}_i \supset X$. If ℰ† = $\langle \mathcal{R}_1, \ldots, \mathcal{R}_m \rangle$ is a list of elimination rules for †, we define the *content of* ℰ† (notation $\mathcal{C}(\mathcal{E}_\dagger)$) to be the L<sup>2</sup>-formula $\bigwedge_{j=1}^{m} \mathcal{C}(\mathcal{R}_j)$.

<sup>5</sup> With the exception of this last condition, the definitions of introduction and elimination rules follow those given in Schroeder-Heister (2014a). The reason for considering this final restriction will be discussed at the end of Section 4.

Suppose now we are given a list ℐ† of introduction rules and a list ℰ† of elimination rules for an $n$-ary connective †. We say that the two collections of rules are in *weak harmony* if and only if the following holds in NI<sup>2</sup>:<sup>6</sup>

$$\mathcal{C}(\mathcal{I}_\dagger) \dashv\vdash \mathcal{C}(\mathcal{E}_\dagger)$$

For example, the rules ∧I, ∧E<sup>1</sup> and ∧E′<sup>2</sup> discussed in the introduction are the formulas (we use the connective variable † for conjunction) $\forall XY.(X \land Y) \supset \dagger(X, Y)$, $\forall XY.\,\dagger(X, Y) \supset X$ and $\forall XY.(\dagger(X, Y) \land X) \supset Y$ respectively. Thus, the content of the collection of introduction rules for conjunction consisting of ∧I and that of the collection of elimination rules for conjunction consisting of ∧E<sup>1</sup> and ∧E′<sup>2</sup> are $X \land Y$ and $X \land (X \supset Y)$ respectively. As these two formulas are interderivable in NI<sup>2</sup>, the two collections of rules are in weak harmony. As the reader can check, the collections of introduction and elimination rules consisting of the rules of NI are in weak harmony as well.

Prawitz (1979) and Schroeder-Heister (1981; 1984; 2014b) proposed a simple method to construct a weakly harmonious collection of elimination rules by "inverting" a collection of introduction rules for a given connective †. In particular, let ℐ† be a sequence of $m$ distinct introduction rules of the above form, i.e., ℐ† = $\langle \mathcal{R}_1, \ldots, \mathcal{R}_m \rangle$, with $\mathcal{R}_j = \forall \overrightarrow{X} (\bigwedge_{i=1}^{n_j} \mathcal{R}_{ji} \supset \dagger(\overrightarrow{X}))$ for all $1 \leq j \leq m$. The Prawitz-Schroeder-Heister collection of canonical elimination rules associated to ℐ†, to be indicated as PSH(ℐ†), is the list containing only one element, namely the elimination rule

$$\forall X \forall \overrightarrow{X}. \left( \dagger(\overrightarrow{X}) \land \bigwedge_{j=1}^{m} \left( \bigwedge_{i=1}^{n_j} \mathcal{R}_{ji} \supset X \right) \right) \supset X$$

that we indicate with †E<sub>PSH(ℐ†)</sub>. The collections of introduction rules ℐ† and of elimination rules PSH(ℐ†) are in weak harmony, since in NI<sup>2</sup> the content of ℐ† is interderivable with that of PSH(ℐ†) (i.e., with the content of †E<sub>PSH(ℐ†)</sub>):

$$\left(\bigvee_{j=1}^{m}\bigwedge_{i=1}^{n_j}\mathcal{R}_{ji}\right)\dashv\vdash\;\forall X.\left(\bigwedge_{j=1}^{m}\left(\bigwedge_{i=1}^{n_j}\mathcal{R}_{ji}\supset X\right)\right)\supset X$$
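Since the canonical elimination rule is determined purely by the premise lists of the introduction rules, its construction can be made mechanical. The following Python sketch is ours (formulas are plain strings; the quantifier prefix over the arguments of † and any ∀Y⃗ are left implicit):

```python
# Introduction rules for † are given just by their premise lists (formulas as
# plain strings); the canonical elimination rule is assembled as a string.
# Names and representation are ours, for illustration only.

def conj(parts):
    """Right-nested conjunction of a list of formulas; ⊤ for the empty list."""
    if not parts:
        return "⊤"
    if len(parts) == 1:
        return parts[0]
    return f"({parts[0]} ∧ {conj(parts[1:])})"

def psh_elim(intro_premises, conn_args):
    """The single Prawitz-Schroeder-Heister elimination rule determined by
    the premise lists of the introduction rules."""
    minor = conj([f"({conj(rs)} ⊃ X)" for rs in intro_premises])
    return f"∀X.((†({', '.join(conn_args)}) ∧ {minor}) ⊃ X)"

# Disjunction: ∨I1 has premise Y, ∨I2 has premise Z (writing †(Y,Z) for Y ∨ Z).
print(psh_elim([["Y"], ["Z"]], ["Y", "Z"]))
```

For the two introduction rules for disjunction this yields the formula (∨e) up to the outer quantifiers ∀YZ and the bracketing of ∧.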

As Schroeder-Heister (2014a) shows, the left-to-right direction of harmony (called the "conservativity" criterion) warrants that the addition of the rules for † (understood as meta-linguistic schemata) to a given natural deduction system N yields a conservative extension of N, and the right-to-left direction warrants the uniqueness of † (where conservativity and uniqueness are understood in the sense of Belnap, 1962).

As shown in Schroeder-Heister (1981), one can define reduction procedures to get rid of consecutive applications of an introduction rule for a connective followed immediately by the Prawitz-Schroeder-Heister elimination rule, and the same is true for expansions (see Tranchini, 2016a). Actually, reductions and expansions are available not only when the elimination rules follow the pattern of Prawitz and Schroeder-Heister: if two collections of introduction and elimination rules are in weak harmony, it is possible to equip them with expansions and reductions as well. A formalization of this claim hinges on a formal characterization of what sort of

<sup>6</sup> Observe that the two formulas contain no occurrence of connective variables, that is, they belong to the proper second-order system NI<sup>2</sup>. Cf. note 4 above.

operations can qualify as reductions and expansions. Here we limit ourselves to an informal sketch of how reductions and expansions for harmonious rules can be obtained from those associated with the "canonical pair" consisting of a collection of introduction rules and the Prawitz-Schroeder-Heister collection of elimination rules, and to a discussion of some examples.

Observe first that, if a collection of introduction rules ℐ† and a collection of elimination rules ℰ† = $\langle \mathcal{R}_1, \ldots, \mathcal{R}_m \rangle$ are in weak harmony, then the content of ℰ† is interderivable with the content of the Prawitz-Schroeder-Heister elimination rule, and hence the content of each elimination rule in ℰ† is derivable from that of †E<sub>PSH(ℐ†)</sub>. From each possible way of deriving the content of a rule $\mathcal{R}_k$ in ℰ† from that of †E<sub>PSH(ℐ†)</sub> one can "extract" a reduction procedure to get rid of consecutive applications of any introduction rule in ℐ† followed immediately by $\mathcal{R}_k$. Moreover, from the derivation of the content of †E<sub>PSH(ℐ†)</sub> from the content of ℰ† we can "extract" an expansion for the collections of rules ℐ† and ℰ†.

For example, let us reconsider the collection of introduction rules consisting solely of ∧I and the "deviant" collection of elimination rules consisting of ∧E<sup>1</sup> and ∧E′<sup>2</sup> that we discussed in the introduction. We can obviously define the following reductions and expansion:

$$\dfrac{\dfrac{\overset{\mathcal{D}_1}{A} \quad \overset{\mathcal{D}_2}{B}}{A \land B}\,{\scriptstyle \land\text{I}}}{A}\,{\scriptstyle \land\text{E}_1} \;\; \overset{-\wedge_1}{\triangleright} \;\; \overset{\mathcal{D}_1}{A} \qquad\qquad \dfrac{\dfrac{\overset{\mathcal{D}_1}{A} \quad \overset{\mathcal{D}_2}{B}}{A \land B}\,{\scriptstyle \land\text{I}} \quad \overset{\mathcal{D}_1}{A}}{B}\,{\scriptstyle \land\text{E}'_2} \;\; \overset{-\wedge'_2}{\triangleright} \;\; \overset{\mathcal{D}_2}{B}$$

$$\overset{\mathcal{D}}{A \land B} \;\;\text{expands to}\;\; \dfrac{\dfrac{\overset{\mathcal{D}}{A \land B}}{A}\,{\scriptstyle \land\text{E}_1} \quad \dfrac{\overset{\mathcal{D}}{A \land B} \quad \dfrac{\overset{\mathcal{D}}{A \land B}}{A}\,{\scriptstyle \land\text{E}_1}}{B}\,{\scriptstyle \land\text{E}'_2}}{A \land B}\,{\scriptstyle \land\text{I}}$$

Using these transformations one can prove a normalization theorem and the atomization of minimal formulas in normal derivations for the natural deduction system NI′ obtained by replacing ∧E<sup>2</sup> with the deviant ∧E′<sup>2</sup>, by suitably modifying the standard proofs of Prawitz (1965b; 1971) (from which conservativity and uniqueness results for the conjunction governed by these rules follow).

As we remarked in the introduction, however, there are reasons for regarding this as a characterization of a weak notion of harmony, and for looking for a stricter criterion capturing a notion of strong harmony.

The problem with weak harmony is that although one can equip weakly harmonious rules with reductions and expansions, the resulting notion of equivalence on derivations might collapse the notion of isomorphism onto that of interderivability. Tranchini (2016a) considers the following collection of weakly harmonious rules:

Intensional Harmony as Isomorphism 327

the introduction rule ♮I, whose first two premises discharge assumptions, and the three elimination rules ♮E<sup>1</sup>, ♮E<sup>2</sup> and ♮E<sup>3</sup> (for their precise formulation see Tranchini, 2016a).

In spite of the mismatch between the third premise of the introduction and the conclusion of the third elimination, it is not difficult to devise three reductions and an expansion for these rules as well. However, Tranchini shows that any two derivations of a formula from the formula itself can be shown to belong to the same equivalence class induced by these operations. From this it immediately follows that any two closed derivations of the same formula are equivalent. Hence, although the rules are weakly harmonious and thus conservative in Belnap's sense (i.e., with respect to derivability), they are non-conservative with respect to identity of proofs. Let N′ be the result of extending a system N (including the rules for ⊃) with the rules for ♮, and let ≡′ be the smallest equivalence relation on derivations of N′ extending an equivalence relation ≡ on derivations of N (that is closed under the reductions for ⊃) and closed under the reductions and expansions for ♮.<sup>7</sup> For any pair of closed derivations $\mathcal{D}_1$ and $\mathcal{D}_2$ of the same formula we have that $\mathcal{D}_1 \equiv' \mathcal{D}_2$. Thus, if ≡ is non-trivial, N′ is a non-conservative extension of N, in fact a trivially non-conservative extension in which any two derivations (of the same formula) are equated. Similar considerations apply to isomorphism. The notion of isomorphism relative to the equational theory ≡′ is trivial in the sense that any two interderivable formulas qualify as ≡′-isomorphic. Even if the notion of ≡-isomorphism is non-trivial (i.e., even if there are interderivable formulas which are not ≡-isomorphic), ≡′-isomorphism is trivial. Thus, we can say that weakly harmonious rules are unacceptable since their addition to a system has the consequence of blurring meaning distinctions.

#### **4 Strong harmony via isomorphism**

As recalled in the introduction, a tighter connection between introduction and elimination rules could be achieved by replacing interderivability with syntactic identity in Schroeder-Heister's definition of weak harmony. The resulting notion could be dubbed *strict harmony*, since not only are the standard introduction rule for conjunction and the deviant collection of elimination rules not in strict harmony, but not even the two standard introduction rules for disjunction and its standard elimination rule are in strict harmony. One may expect that a middle ground between weak and strict harmony, a notion of *strong harmony*, could be obtained by using isomorphism instead of mere interderivability in the definition of harmony. At first, it might

<sup>7</sup> The existence of ≡′ is warranted by the fact that the class of equivalence relations with these properties is closed under infinite intersection.

seem that this move does not bring us very far. If strong harmony is defined using βη-isomorphism, not only would the rules for ♮ fail to qualify as harmonious, but

also those for disjunction, since $A \lor B \not\simeq_{\beta\eta} \forall X.((A \supset X) \land (B \supset X)) \supset X$.

However, this counter-example does not rule out a definition of strong harmony based on isomorphism *per se*, but only one based on βη-isomorphism. Actually, there are independent reasons for adopting a different notion of equivalence and of isomorphism when working in NI<sup>2</sup>. Both in NI and NI<sup>2</sup> the maximum non-trivial notion of equivalence can be defined as *contextual equivalence* (notation $\overset{CE}{\equiv}$). In the setting of natural deduction this notion can be defined as follows: two derivations $\mathcal{D}_1$ and $\mathcal{D}_2$ of A are contextually equivalent if, for every derivation $\mathcal{D}$ of ⊤ ∨ ⊤ from the assumption A, the derivation obtained by replacing the assumption A in $\mathcal{D}$ with $\mathcal{D}_1$ and the one obtained by replacing it with $\mathcal{D}_2$ are βη-equivalent.<sup>8</sup>

In contrast to what happens in NI, where βη-equivalence and contextual equivalence coincide (Scherer, 2017), in NI<sup>2</sup> βη-equivalence is much weaker than contextual equivalence (so much so that the former is decidable, while contextual equivalence is undecidable; this is well known, and for a detailed proof of the latter claim see, e.g., Pistone and Tranchini, 2021). Similarly, the notion of isomorphism arising from contextual equivalence is strictly richer than βη-isomorphism. In particular, $A \lor B$ does qualify as CE-isomorphic to $\forall X.((A \supset X) \land (B \supset X)) \supset X$.

From these considerations, a natural proposal would be that of defining strong harmony by replacing interderivability with CE-isomorphism in Schroeder-Heister's definition of weak harmony. The resulting notion would allow one to overcome the problems of weak harmony. However, this notion too is not entirely satisfactory. In general, it is very hard to decide whether two derivations of a formula are contextually equivalent, since one has to consider *all* derivations of ⊤ ∨ ⊤ from that formula. Moreover, due to the undecidability of contextual equivalence, there is little hope for the decidability of the notion of CE-isomorphism in NI<sup>2</sup>, and hence for that of a notion of strong harmony defined in its terms.

For this reason, Tranchini, Pistone, and Petrolo (2019) investigated notions of equivalence lying between βη-equivalence and contextual equivalence, in the hope of finding a notion more manageable than CE-equivalence but still suitable for the definition of the notion of strong harmony. In particular, the authors focused on the

<sup>8</sup> The derivation $\mathcal{D}$ can be seen as a context, and thus $\mathcal{D}_1 \overset{CE}{\equiv} \mathcal{D}_2$ means that $\mathcal{D}_1$ and $\mathcal{D}_2$ are (βη-)equivalent in any context of type ⊤ ∨ ⊤. Note that ⊤ ∨ ⊤ is a proposition with exactly two distinct proofs, since ⊤ is the proposition with a unique (trivial) proof, and a proof of a disjunction is a proof of either of the disjuncts (together with a bit of information telling which of the two disjuncts is proved). Thus contextual *inequivalence* means that two proofs can be distinguished in some context in which the possible results of evaluating them are only two.

one arising from the functorial interpretation of second-order formulas (Bainbridge, Freyd, Scedrov, and Scott, 1990) and established some results suggesting that this notion fits the needs of the study of generalized intuitionistic connectives in the second-order setting (Pistone and Tranchini, 2021).

The key idea underlying this extension of βη-equivalence can be informally introduced starting from the statement of the proof-conditions of a formula of the form ∀X.A in the style of the BHK-clauses:

A proof of ∀X.A (henceforth a *universal proof*) is a function $f$ that, applied to a proposition B, yields a proof of A[B/X].

As in the case of a proof of A ⊃ B (taken to be a function from proofs of A to proofs of B), a universal proof cannot be merely understood as an infinite list of ordered pairs (each consisting of a proposition B and of a proof of A[B/X]), on pain of making the notion of proof epistemically unsurveyable, and thereby contradicting the assumption that intuitionistic proofs are the result of an activity of mental construction performed by a knowing subject. Rather, these functions have to be understood as given in such a way as to make it possible for a knowing subject to grasp them. A way of meeting this demand is that of assuming that a universal proof $f$ of ∀X.A is a function that associates a proof of A[B/X] to each proposition B "in a uniform way".<sup>9</sup>

To ask that the values of a universal proof $f$ are assigned to its arguments in a uniform way amounts to requiring that the values of $f$ themselves must be defined in a uniform manner. We take this to mean that in defining each value of $f$ (i.e., the proofs of A[B/X]) no knowledge about the arguments of $f$ (i.e., the propositions B) should be assumed (for a more thorough discussion, see Pistone, 2018).

Consider for example the proposition ∀X.X ⊃ X. A proof of this proposition is a function mapping each proposition B onto a proof of B ⊃ B, which in turn is a function from the set of proofs of B onto itself.<sup>10</sup> It is true that there may be different ways of mapping the set of proofs of a certain proposition onto itself. However, there do not seem to be many options for defining such a map without assuming any knowledge about the proposition B and hence about its set of proofs. In fact, it seems that the only function one can come up with is the identity function. In other words, if we assume the proofs of ∀X.X ⊃ X to be uniform functions, there seems to be only one such proof, namely the one associating to each proposition B the identity function id<sub>B</sub> on the set of proofs of B. Although this informal argument is inconclusive, being based on considerations of a heuristic nature, it turns out that the strengthening of βη-equivalence considered by the authors captures exactly these intuitions.

That the assumption of uniformity has consequences for identity of proofs is not as surprising as it may appear at first. Consider proofs of propositions of the form

<sup>9</sup> The notion of uniformity has been widely investigated in theoretical computer science under the name "parametricity" (Strachey, 1967; Reynolds, 1983; Hermida, Reddy, and Robinson, 2014), and it is in a direct line of descent from the "schematic" (as opposed to the "numerical") interpretation of second-order quantification (see, e.g., Carnap, 1931).

<sup>10</sup> Following the common notation in λ-calculus, we indicate the application of a function $f$ to its argument $a$ with $(fa)$, where outermost parentheses are dropped and application is assumed to be left-associative, so that $fgh$ is short for $((fg)h)$.

$(\forall X.A) \supset B$, i.e., functions from proofs of ∀X.A to proofs of B. To assume that universal proofs are uniform functions means that one is restricting the domain of the proofs of (∀X.A) ⊃ B. If two such proofs assign the same value whenever they take an arbitrary, but uniform, universal proof as argument, then they denote the same proof under the assumption that all universal proofs are uniform. Yet, it might still be possible that these two proofs of (∀X.A) ⊃ B assign a different value to some (non-uniform) proof of ∀X.A, so that they would no longer denote the same function without the assumption.<sup>11</sup>

To see how the assumption of uniformity can be used to justify new equations between proofs, consider for example the following two derivations:

$$\dfrac{B \supset C \qquad \dfrac{\dfrac{\forall X.X \supset X}{B \supset B} \quad B}{B}}{C} \qquad\qquad \dfrac{\dfrac{\forall X.X \supset X}{C \supset C} \qquad \dfrac{B \supset C \quad B}{C}}{C}$$

On the assumption that the proofs of ∀X.X ⊃ X are uniform, the two derivations should denote the same proof. This is best appreciated when the derivations are decorated with proof terms:

$$\dfrac{h : B \supset C \qquad \dfrac{\dfrac{f : \forall X.X \supset X}{fB : B \supset B} \quad b : B}{fBb : B}}{h(fBb) : C} \qquad\qquad \dfrac{\dfrac{f : \forall X.X \supset X}{fC : C \supset C} \qquad \dfrac{h : B \supset C \quad b : B}{hb : C}}{fC(hb) : C}$$

Uniformity warrants that $fB$ and $fC$ are the identity functions id<sub>B</sub> and id<sub>C</sub> on the sets of proofs of B and C respectively, and thus the two derivations encode (for any $f$, $h$, $b$, B and C) the same proof of C:

$$h(fBb) = h(\operatorname{id}\_B b) = hb = \operatorname{id}\_C(hb) = fC(hb)$$

It is easy to see that βη-equivalence fails to capture the consequences of uniformity. The two derivations above are βη-normal, and thus (as a consequence of the Church-Rosser theorem for βη-reduction in NI<sup>2</sup>) they belong to two distinct βη-equivalence classes. Hence, in order to capture the uniformity of the proofs of ∀X.X ⊃ X, we need to strengthen the equivalence relation on derivations by requiring it to be closed under the following scheme, the instances of which will be referred to as ε-equations (observe that the left-to-right orientation of these equations can be seen as an operation that permutes the derivation $\mathcal{D}'$ upwards across the application of ∀E):

<sup>11</sup> It may also be worth observing that the restriction to uniform proofs does not require modifying in any way the rules for the second-order quantifier in NI<sup>2</sup>: in fact, the variable condition on the rule ∀I (which ensures that when inferring ∀X.A from A no assumption is made on X) can be seen as a syntactic counterpart of the uniformity requirement informally described above. In other words, all NI<sup>2</sup>-derivations of formulas of the form ∀X.A actually denote uniform universal proofs. Moreover, it is well known that extensions of the syntax of NI<sup>2</sup> with non-uniform constructors, although possible, might lead to inconsistencies (see, e.g., Harper and Mitchell, 1999).


$$\begin{array}{c} \dfrac{\dfrac{\overset{\mathcal{D}}{\forall X.X \supset X}}{B \supset B} \quad \overset{\mathcal{D}''}{B}}{B} \\ \mathcal{D}' \\ C \end{array} \;\;\equiv\;\; \dfrac{\dfrac{\overset{\mathcal{D}}{\forall X.X \supset X}}{C \supset C} \quad \begin{array}{c} \mathcal{D}'' \\ B \\ \mathcal{D}' \\ C \end{array}}{C}$$

Hence, the ε-equations are justified by the assumption that the only proof of ∀X.X ⊃ X is the function associating to each proposition the identity function on its set of proofs. Conversely, in every categorial model of NI<sup>2</sup> in which the ε-equations are satisfied, ∀X.X ⊃ X has exactly one proof.

Analogous informal considerations show that the set of uniform proofs of ∀X.(A ⊃ X) ⊃ X must be in bijection with the set of proofs of A itself (provided X does not occur free in A). In particular, a proof of ∀X.(A ⊃ X) ⊃ X associates to each proposition B a function from proofs of A ⊃ B (which in turn are functions from proofs of A into proofs of B) to proofs of B. But the only way to define such a proof in a uniform manner consists in taking a proof $a$ of A (if any is available) and associating to each proposition B the function that maps each proof $g$ of A ⊃ B onto $ga$.

Syntactically, uniformity can again be expressed as the possibility of permuting a derivation upwards across an application of ∀E with premise ∀X.(A ⊃ X) ⊃ X, using the ε-equations obtained from the scheme below. In this case, observe that the derivation $\mathcal{D}'$ cannot be permuted as it stands, on pain of changing the open assumptions of the derivation, and due to the mismatch between the conclusion of $\mathcal{D}'$ and the minor premise required to apply ⊃E. The mismatch can however be resolved by "surrounding" $\mathcal{D}'$ (whose conclusion is D and whose undischarged assumptions are C and possibly further assumptions Δ) with some applications of elimination and introduction rules, yielding a derivation, that we indicate with $(A \supset X)\{\mathcal{D}'\}$, of $(A \supset X)[D/X]$ from $(A \supset X)[C/X]$ and Δ:

$$\begin{array}{c} \dfrac{\dfrac{\overset{\mathcal{D}}{\forall X.(A \supset X) \supset X}}{((A \supset X) \supset X)[C/X]} \quad (A \supset X)[C/X]}{C} \\ \mathcal{D}' \\ D \end{array} \;\;\equiv\;\; \dfrac{\dfrac{\overset{\mathcal{D}}{\forall X.(A \supset X) \supset X}}{((A \supset X) \supset X)[D/X]} \quad \begin{array}{c} (A \supset X)[C/X] \\ (A \supset X)\{\mathcal{D}'\} \\ (A \supset X)[D/X] \end{array}}{D}$$

where

$$(A \supset X)\{\mathcal{D}'\} \;=\; \begin{array}{c} \dfrac{(A \supset X)[C/X] \quad \overset{n}{A}}{C} \\ \mathcal{D}' \\ \dfrac{D}{(A \supset X)[D/X]}\,{\scriptstyle(n)} \end{array}$$

Since the uniform proofs of ∀X.(A ⊃ X) ⊃ X are in bijection with those of A, it should be possible, syntactically, to show that ∀X.(A ⊃ X) ⊃ X and A are isomorphic relative to any notion of equivalence that is strong enough to encode the uniformity of universal proofs. This is actually the case: given the two derivations

$$\mathcal{D}_1 \;=\; \dfrac{\dfrac{\dfrac{\overset{n}{A \supset X} \quad A}{X}}{(A \supset X) \supset X}\,{\scriptstyle(n)}}{\forall X.(A \supset X) \supset X} \qquad\qquad \dfrac{\dfrac{\forall X.(A \supset X) \supset X}{(A \supset A) \supset A} \qquad \dfrac{\overset{n}{A}}{A \supset A}\,{\scriptstyle(n)}}{A} \;=\; \mathcal{D}_2$$

it is easy to show that the composition obtained by substituting $\mathcal{D}_1$ for the assumption ∀X.(A ⊃ X) ⊃ X of $\mathcal{D}_2$ β-reduces to the derivation consisting only of the assumption of A, and that the converse composition, obtained by substituting $\mathcal{D}_2$ for the assumption A of $\mathcal{D}_1$, after the application of an ε-permutation, βη-reduces to the derivation consisting only of the assumption of ∀X.(A ⊃ X) ⊃ X.
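The two compositions can be checked in term notation. Assuming the usual Curry-Howard term assignment for NI<sup>2</sup> (with Λ for ∀I and type application for ∀E; the rendering below is ours), $\mathcal{D}_1$ corresponds to $\Lambda X.\lambda g^{A \supset X}.\,g\,a$ (with $a$ a free variable of type A) and $\mathcal{D}_2$ to $f\,A\,(\lambda a'.a')$ (with $f$ free of type ∀X.(A ⊃ X) ⊃ X):

```latex
% Composing D1 into the assumption f of D2 (a : A free); beta-steps only:
(\Lambda X.\lambda g^{A\supset X}.\,g\,a)\,A\,(\lambda a'.\,a')
  \;\to_\beta\; (\lambda g^{A\supset A}.\,g\,a)\,(\lambda a'.\,a')
  \;\to_\beta\; (\lambda a'.\,a')\,a
  \;\to_\beta\; a

% Composing D2 into the assumption a of D1 (f free): one eps-permutation,
% then eta-steps:
\Lambda X.\lambda g^{A\supset X}.\,g\,(f\,A\,(\lambda a'.\,a'))
  \;\equiv_\varepsilon\; \Lambda X.\lambda g^{A\supset X}.\,f\,X\,(\lambda a'.\,g\,a')
  \;\to_\eta\; \Lambda X.\lambda g^{A\supset X}.\,f\,X\,g
  \;\to_\eta\; \Lambda X.\,f\,X
  \;\to_\eta\; f
```

The first composition ends in the assumption $a$ of A, the second (after the ε-permutation) in the assumption $f$, exactly as described above.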

In a similar way, we can define ε-permutations for the formula ∀X.(A ⊃ X) ⊃ ((B ⊃ X) ⊃ X) (provided X does not occur free in A and B), encoding the uniformity of the proofs of this proposition. On the one hand, this formula is βη-isomorphic to ∀X.((A ⊃ X) ∧ (B ⊃ X)) ⊃ X, and on the other hand, using the ε-permutations we can show that $\forall X.(A \supset X) \supset ((B \supset X) \supset X) \overset{\varepsilon}{\simeq} A \lor B$. Hence, we have that $A \lor B \overset{\varepsilon}{\simeq} \forall X.((A \supset X) \land (B \supset X)) \supset X$, that is, that by defining strong harmony using ε-isomorphism, the standard rules for ∨ qualify as strongly harmonious.

More generally, we can establish that:

$$\left(\bigvee_{j=1}^{m}\bigwedge_{i=1}^{n_j}\mathcal{R}_{ji}\right)\overset{\varepsilon}{\simeq}\;\forall X.\left(\bigwedge_{j=1}^{m}\left(\bigwedge_{i=1}^{n_j}\mathcal{R}_{ji}\supset X\right)\right)\supset X$$

and hence that any collection of introduction rules and its Prawitz-Schroeder-Heister collection of elimination rules are in strong harmony, by defining ε-permutations for all quantified formulas ∀X.A in which A has a distinctively simple form that we call *nested sp-X* (see, for details, Tranchini, Pistone, and Petrolo, 2019). An L2<sup>⊃</sup>-formula A is *strictly positive in X* (sp-X) if X does not occur to the left of a ⊃ in A, and a nested sp-X formula is a formula of the form $A_1 \supset (\ldots (A_n \supset X) \ldots)$ where $A_i$ is sp-X for all $1 \leq i \leq n$. The above isomorphism is established by showing that the right-hand side formula is βη-isomorphic to the nested sp-X formula

$$\forall X. \left( \mathcal{R}_{11} \supset \left( \dots \left( \mathcal{R}_{1n_1} \supset X \right) \dots \right) \right) \supset \left( \dots \left( \left( \mathcal{R}_{m1} \supset \left( \dots \left( \mathcal{R}_{mn_m} \supset X \right) \dots \right) \right) \supset X \right) \dots \right)$$

which in turn is ε-isomorphic to the left-hand side formula.

Let L2<sup>⊃</sup><sub>sp</sub> be the fragment of L2<sup>⊃</sup> obtained by allowing a formula A to be prefixed with ∀X only if A is nested sp-X. The contents of introduction and elimination rules of the form we considered above are βη-isomorphic to formulas in L2<sup>⊃</sup><sub>sp</sub>, and so are the contents of collections of introduction and elimination rules.

Let NI2<sup>⊃</sup><sub>sp</sub> be the restriction of NI<sup>2</sup> to the language L2<sup>⊃</sup><sub>sp</sub>. As shown by the authors (Pistone and Tranchini, 2021), the ε-equational theory has characteristically strong properties in this fragment: it is decidable, and it is the maximum non-trivial equivalence extending βη-equivalence.<sup>12</sup> Thus, taking the stance of Došen and Widebäck, ε-equivalence (resp. ε-isomorphism) can be considered as the canonical notion of equivalence (resp. isomorphism) in NI2<sup>⊃</sup><sub>sp</sub>.

These decidability and maximality results are based on the fact that the derivations of NI2<sup>⊃</sup><sub>sp</sub> modulo ε-equivalence form a category equivalent to that of the derivations of NI modulo βη-equivalence. This in turn implies that the question of the decidability of ε-isomorphism in NI2<sup>⊃</sup><sub>sp</sub> is equivalent to that of the decidability of βη-isomorphism in NI, which is (as remarked in the previous section) still open.

The foregoing results speak in favor of defining strong harmony using ε-isomorphism rather than βη- or CE-isomorphism, at least when the form of introduction and elimination rules follows the schemata given above.

Whenever confronted with two collections of introduction and elimination rules for †, we are not in general capable of telling whether they are in strong harmony (since βη-isomorphism in NI, and hence ε-isomorphism in NI2<sup>⊃</sup><sub>sp</sub>, is, as of today, not known to be decidable), but we can decide whether certain derivations do or do not testify to their ε-isomorphism.

#### **5 Concluding remarks**

The account of strong harmony using ε-isomorphism delivers a satisfactory sharpening of the notion of weak harmony developed by Schroeder-Heister, for introduction and elimination rules of the form discussed above.

It is worth stressing, however, that Schroeder-Heister (2014b) considers introduction and elimination rules of a more general form, namely the following:

$$(\text{INTRO}^*) \qquad\qquad \forall \overrightarrow{X} \forall \overrightarrow{Y} \left( \bigwedge_{i=1}^{n} \mathcal{R}_i \supset \dagger(\overrightarrow{X}) \right)$$

$$(\text{ELIM}^*) \qquad\qquad \forall X \forall \overrightarrow{Y} \forall \overrightarrow{X} \left( \left( \dagger(\overrightarrow{X}) \land \bigwedge_{i=1}^{n} \mathcal{R}_i \right) \supset X \right)$$

satisfying the following two conditions:


By dropping the third condition on elimination rules (see footnote 5 above) and allowing nested quantification inside introduction and elimination rules, these less restricted forms of introduction and elimination rules significantly enrich the class of connectives amenable to a characterization in terms of "pure" introduction and elimination rules (i.e., rules in which no connective occurs apart from the one being "defined"). For instance, Schroeder-Heister (2014b) observes that

<sup>12</sup> Moreover, it allows one to show that any derivation is ε-equivalent to one in which all applications of ∀E have an atomic witness (Pistone, Tranchini, and Petrolo, 2021).

it is possible to formulate an introduction rule for negation which does not mention ⊥, namely $\forall X.(\forall Y.(X \supset Y)) \supset \dagger(X)$.

This more general form of introduction rules is, however, much more expressive than that, as it allows one, for instance, to formulate an introduction rule for a zero-place connective:

$$(\forall Y.\, ((Y \supset Y) \land Y) \supset Y) \supset \dagger$$

whose premise is essentially the impredicative encoding of the natural number predicate in NI<sup>2</sup>.
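Indeed, by a standard currying isomorphism the premise of this rule is βη-isomorphic to the familiar impredicative type of the natural numbers:

```latex
\forall Y.\,((Y \supset Y) \land Y) \supset Y
  \;\simeq_{\beta\eta}\;
\forall Y.\,(Y \supset Y) \supset (Y \supset Y)
```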

In contrast to what we observed in the case of introduction and elimination rules of the more restricted form we considered throughout the paper, the contents of introduction and elimination rules of this more general form are formulas which cannot be shown to be βη-isomorphic to formulas in the fragment L2<sup>⊃</sup><sub>sp</sub>.

It is true that the ε-equations can be formulated for any formula of the form ∀X.A. However, in contrast to what happens in the restricted fragment considered so far, as soon as one allows for encodings of inductive types the ε-equational theory is not decidable (Pistone and Tranchini, 2021) and might not even be maximal (this is suggested by the fact that the ε-equational theory for the fragment of NI2<sup>⊃</sup> containing the encoding of the natural number predicate is related to a (non-maximal) equational theory for Gödel's System T investigated by Okada and Scott, 1999).<sup>13</sup>

Thus, on this more general understanding of introduction and elimination rules, it may be more appropriate to define the notion of strong harmony using an equational theory stronger than the ε-theory. Moreover, since the ε-equational theory is undecidable outside the L2<sup>⊃</sup><sub>sp</sub> language fragment, there is little hope for a decidable notion of strong harmony when introduction and elimination rules of this more general form are taken into consideration.

We conclude by observing that the notions of weak and strong harmony as defined in the work of Schroeder-Heister and in the present paper are not directly applicable to some prominent examples discussed in the literature, such as the rules for quantum disjunction<sup>14</sup> and rules whose formulation requires first-order structure (as, e.g., those of the identity predicate). The extension of the notions of weak and strong harmony to a first-order setting is an interesting topic for further research.

<sup>13</sup> Equational theories stronger than βη for the whole of NI2<sup>⊃</sup> have been studied, among others, by Longo, Milsted, and Soloviev (1993).

<sup>14</sup> As to "quantum disjunction" (see footnote 3 above), due to the restriction on its elimination rule, it does not seem that its elimination content is expressible using an NI<sup>2</sup> formula, and hence its rules fall outside the scope of weak (and hence strong) harmony. One could however consider extensions of NI<sup>2</sup> capable of expressing rules with restrictions of the kind displayed by the quantum disjunction elimination rule. The most natural possibility would be that of extending NI<sup>2</sup> with an implication of the kind described by Dummett (1991) (see also Tranchini, 2018), i.e., one whose introduction rule is restricted so that it can be applied only if its premise depends on no assumptions other than those to be discharged by the rule. Using ⊃ for this connective, the content of the collection of elimination rules for quantum disjunction would be expressible as ∀X.((A ⊃ X) ∧ (B ⊃ X)) ⊃ X. Perhaps unsurprisingly, this formula is interderivable with A ∨ B in the envisaged extension of NI<sup>2</sup>, and thus the rules of quantum disjunction would qualify as weakly harmonious. The question of strong harmony is harder to address, since it would require the definition of an appropriate equivalence relation on derivations for the system considered (and this might not be obvious, since no expansion seems to be available for the implication with restricted introduction rule; see Tranchini, 2018). This is however not the only option, and the issue requires further investigation.

#### **References**


Tranchini, L. (2012). Truth from a proof-theoretic perspective. *Topoi* 31, 47–57.


Widebäck, F. (2001). *Identity of proofs*. Stockholm: Almqvist & Wiksell.


## **A Note on Synonymy in Proof-Theoretic Semantics**

Heinrich Wansing

**Abstract** The topic of identity of proofs was put on the agenda of general (or structural) proof theory at an early stage. The relevant question is: When are the differences between two distinct proofs (understood as linguistic entities, proof figures) of one and the same formula so inessential that it is justified to identify the two proofs? The paper addresses another question: When are the differences between two distinct formulas so inessential that these formulas admit of identical proofs? The question appears to be especially natural if the idea of working with more than one kind of derivation is taken seriously. If a distinction is drawn between proofs and disproofs (or refutations) as primitive entities, it is quite conceivable that a proof of one formula amounts to a disproof of another formula, and vice versa. A notion of inherited identity of derivations is introduced for derivations in a cut-free sequent system for Almukdad and Nelson's constructive paraconsistent logic N4 with strong negation. The notion is obtained by identifying sequent rules the application of which has no effect on the identity of derivations. The notion of inherited identity is then used to define a bilateralist notion of synonymy between formulas, which is a relation drawing more fine-grained distinctions between formulas than strong equivalence.

**Key words:** proof-theoretic semantics, synonymy, identity of derivations, constructive logic N4, BHK interpretation

#### **1 Introduction**

Proof-theoretic semantics has largely focused on the central semantical notion of validity; cf. Francez (2015) and Schroeder-Heister (2018). But there are other important semantical notions as well, and one of them is the notion of synonymy. In linguistics, a distinction is drawn between total and partial (or relative) synonymy

Heinrich Wansing

Department of Philosophy I, Ruhr University Bochum, Germany, e-mail: Heinrich.Wansing@rub.de

© The Author(s) 2024

T. Piecha and K. F. Wehmeier (eds.), *Peter Schroeder-Heister on Proof-Theoretic Semantics*, Outstanding Contributions to Logic 29, https://doi.org/10.1007/978-3-031-50981-0\_11

of expressions, where the former is synonymy in the very strict sense of sameness of linguistic meaning in all its aspects. The expressions "square" and "equilateral rectangle," for example, are often regarded as being strictly synonymous, whereas the expressions "discover" and "find" are only partially synonymous, and so are "new" and "novel" as well as "buy" and "purchase." Whilst some linguists doubt that there exist strict synonyms, it is generally agreed in denotational semantics that synonymy implies sameness of denotation.

It seems that the reasons for assuming only partial instead of strict synonymy in natural languages, i.e., reasons such as different connotations associated with only partially synonymous expressions or their belonging to different dialects, do not apply to formal languages. The present paper pursues the idea of explicating synonymy between formulas from a certain formal propositional language in terms of identity between derivations in a given sequent-style proof system. Therefore, no distinction will be drawn between strict and partial synonymy. Also, there is no given, independently motivated notion of synonymy that the paper aims to capture.<sup>1</sup>

#### **2 Identity of derivation trees and synonymy**

There is a research program that bases synonymy of formulas on a notion of identity of proofs. The topic of identity of proofs (or identity of derivations, if proofs are seen as entities represented by derivations)2 was put on the agenda of general (or structural) proof theory by Dag Prawitz already in his seminal paper Prawitz (1971). On p. 237 Prawitz writes:

In the same way as one asks when two formulas define the same set or two sentences express the same proposition, one asks when two derivations represent the same proof; in other words, one asks for identity criteria for proofs or for a "synonymity" (or equivalence) relation between derivations.

Moreover, Prawitz (1971, p. 257) formulates a conjecture, which he attributes to Per Martin-Löf (on p. 261) and for which he mentions an influence by ideas of William Tait: "Two derivations represent the same proof if and only if they are equivalent,"<sup>3</sup> where equivalence is the reflexive, symmetric, and transitive closure of a certain reducibility relation between two natural deduction derivations with the same set of assumptions and the same conclusion. The relevant question then is: When are the

<sup>1</sup> I dedicate this paper to Peter Schroeder-Heister, whom I appreciate for his path-breaking work in proof-theoretic semantics, who some thirty years ago pointed out to me Nuel Belnap's display logic and Franz von Kutschera's work on functional completeness, and who made it possible for me to attend the now legendary conference on logics without structural rules in Tübingen 1989 and various conferences on proof-theoretic semantics.

<sup>2</sup> There is a convention of referring to proofs as semantical counterparts of derivations, but this terminological convention is not always followed. In Harper, Honsell, and Plotkin (1993), for instance, proofs are explicitly listed together with terms and formulas as syntactic entities.

<sup>3</sup> A detailed study and critical discussion of this conjecture, understood as saying that derivations represent the same proofs if they are βη-equal, can be found in Rezende de Castro Alves (2019).

differences between two distinct proofs (understood as linguistic entities, proof figures or prooftrees) of one and the same formula so inessential that it is justified to identify the two proofs? Troelstra and Schwichtenberg (2000, p. 27), for example, raise the question "When are two prooftrees to be regarded as identical?" and continue by explaining that "[t]aking the formulas-as-types isomorphism as our guideline, we can say that two prooftrees are the same, if the corresponding terms of simple type theory are the same (modulo renaming bound variables)." The βη-equal terms of simply typed λ-calculi à la Church are terms of one and the same type,<sup>4</sup> and it seems that a certain question that is in a sense dual to the one concerning identifying syntactically distinct derivations of one and the same formula has received much less attention:

(Q) When are the diferences between two distinct formulas so inessential that these formulas admit of identical proofs?

If there are distinct formulas with identical proofs, i.e., with essentially the same proof, these formulas could be declared to be synonymous.<sup>5</sup> One reason why question (Q) has, apparently, not been asked could be that one may expect *structurally different* formulas to have distinct proofs that reflect the differing syntactical structure of the formulas under consideration. Kosta Došen and Zoran Petrić (2012) point out that Rudolf Carnap envisaged an understanding of synonymy that focuses on the internal structure of formulas or sentences. Carnap (1947, p. 56) explains that

If two sentences are built *in the same way* out of corresponding designators with the same intensions, then we shall say that they have the same intensional structure. We might perhaps also use for this relation the term '*synonymous*', because it is used in a similar sense by other authors (emphasis added).

In order to give an account of the synonymy of syntactic contexts *in which formulas occur*, one would then seem to need a notion of synonymy of formulas. While Carnap considered the internal grammatical structure of formulas, there is also an established theory of isomorphism of propositions in proof theory, isomorphism of types in functional programming, and isomorphism of objects in category theory that focuses on outer structures, such as derivations, in which formulas occur. As highlighted by Roberto di Cosmo (1995, p. 178), the problem of characterizing isomorphic propositions, types, or objects "is really just one problem, as the solution to one of these is a solution to all the others." This problem has been investigated by a number of category theorists, computer scientists working on typed lambda calculi, and proof theorists; see the survey di Cosmo (2005) and the references and further information given there, Danos, Joinet, and Schellinx (2003), where the term "computational isomorphism" is used, and Došen (1997; 2006), Došen and Petrić (2012), Restall (2019), and Tranchini (2021).<sup>6</sup> In Došen (1997, p. 306) Kosta Došen introduces the

<sup>4</sup> In Curry-style simply typed λ-calculi, η-reduction is in general not type-preserving; a counterexample can be found in Sørensen and Urzyczyn (2006, p. 384).

<sup>5</sup> This thought calls into doubt, of course, the view that syntactically distinct proofs can be essentially the same proof only if they are proofs of the very same formula.

<sup>6</sup> An approach to synonymy in a calculus of typed λ-terms that can be applied to a formal rendering of large fragments of English has been presented by Yiannis Moschovakis (2006), who defines a notion of referential synonymy between λ-terms as a relation of identity between meanings understood as abstract algorithms.

notion of isomorphic formulas in a syntactical category-theoretic framework, where the objects are formulas and the arrows are deductions, as follows:

A formula A is isomorphic to a formula B if and only if there is a deduction f from A to B and a deduction f<sup>−1</sup> from B to A such that f composed with f<sup>−1</sup> reduces via normalization to the identity deduction **1** from A to A and f<sup>−1</sup> composed with f reduces via normalization to the identity deduction **1** from B to B (these reductions are represented in categories by equations between arrows). That two formulae are isomorphic is equivalent to the assertion that the deductions involving one of them, either as premise or as conclusion, can be extended to deductions where this formula is replaced by the other, the deductions involving the first formula being in one-one correspondence with the deductions involving the second. Roughly speaking, whatever you can do in deductions with one of these formulae you can do as well with the other.

Došen then notes that isomorphism between formulas is an equivalence relation stronger than mutual derivability (for formula-to-formula derivations in intuitionistic logic, for example, the mutually derivable formulas (A ∧ A) and A fail to be isomorphic), and explains that it

seems reasonable to suppose that isomorphism analyzes propositional identity, i.e. identity of meaning for propositions: A and B stand for the same proposition, i.e., A means the same as B, if and only if A is isomorphic to B. This way we would base propositional identity upon identity of deductions, which is codified by equality between arrows.
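To see why mutually derivable formulas such as A ∧ A and A fail to be isomorphic, the standard computation can be sketched in λ-term notation (our rendering, rather than Došen's arrow-theoretic one):

```latex
\[
  f = \lambda x.\,\langle x, x\rangle : A \supset (A \land A)
  \qquad
  g = \lambda y.\,\pi_1(y) : (A \land A) \supset A
\]
\[
  g \circ f = \lambda x.\,\pi_1\langle x, x\rangle
    \twoheadrightarrow_{\beta} \lambda x.\,x
  \qquad\text{but}\qquad
  f \circ g = \lambda y.\,\langle \pi_1(y), \pi_1(y)\rangle
    \neq_{\beta\eta} \lambda y.\,y .
\]
```

One composite normalizes to an identity deduction, but the other does not, so the pair (f, g) fails the definition of isomorphism.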

More recently, Francesca Poggiolesi (2020) has taken up this idea to define a notion of hyper-isomorphic formulas as formulas that have the same meaning. The main move is to replace in the definition of isomorphic formulas the deducibility relation by another inferential relation, namely logical grounding, defined by single-conclusion grounding rules that identify the premises as the reasons why the conclusion is true.

Within the tradition of structural proof theory and proof-theoretic semantics, synonymy has also been discussed by Cesare Cozzo (1994), who outlines a translation-based account of synonymy, and by Tiago Rezende de Castro Alves (2019).

There thus exist proof-theoretic accounts of propositional identity, i.e., *synonymy* of formulas.<sup>7</sup> Another proof-theoretic road to addressing question (Q) emerges if the idea of working with more than one kind of derivation is taken seriously. If a distinction is drawn between proofs and disproofs (or refutations) as primitive entities, it is quite conceivable that a proof of one formula amounts to a disproof of another formula, and vice versa. In particular, if a proof of a formula A is identified with (amounts to) a disproof of ∼A, the negation of A, and a disproof of a formula A amounts to a proof of ∼A, then, obviously, for any formula A, a proof (disproof) of A is a proof (disproof) of ∼∼A.

Once the possibility of identifying proofs and disproofs of certain distinct formulas is recognized, it may be used to define a notion of synonymy between formulas: Two formulas A and B are synonymous just in case the following two conditions are satisfied: (i) there is a proof D of B from the assumption that A is true and a proof D′ of A from the assumption that B is true, such that D and D′ are identical, and

<sup>7</sup> Note that Göran Sundholm (1999) distinguishes between three notions of identity, namely absolute, criterial, and *propositional identity* (my emphasis), where propositional identity is the propositional function expressed by the binary identity predicate symbol.

(ii) there is a disproof D of B from the assumption that A is false and a disproof D′ of A from the assumption that B is false, such that D and D′ are identical. The present paper develops this idea, which abstains from an appeal to the composition of deductions, by considering sequent rules that have no effect on the identity of derivations and by defining a notion of inherited identity between derivations. This approach is in the spirit of what, following Rumfitt (2000), is called *proof-theoretic bilateralism*, "an approach to meaning taking *denial* as a primitive attitude, on par with *assertion*" (Francez, 2014, p. 239); see also Drobyshevich (2021) and Kürbis (2019). The idea is much older, however, and can be seen as evolving from work by David Nelson (1949) and, independently, Franz von Kutschera (1969). A treatment of proofs and disproofs on an equal footing has been suggested by Edgar López-Escobar (1972), and we shall turn to it in the next section.

#### **3 The Brouwer-Heyting-Kolmogorov interpretation**

We start our considerations with a discussion of the Brouwer-Heyting-Kolmogorov (BHK) interpretation of the connectives of intuitionistic propositional logic in terms of (canonical) proofs of compound formulas. There exist several versions of the BHK interpretation in the literature. Troelstra (1999, p. 235) presents the BHK interpretation of ∨ (disjunction), → (implication) and ¬ (negation) as follows:


Moreover, he notes that "[o]ne can take the notion of a falsehood or contradiction (⊥) as primitive; then ¬ is the same as → ⊥. A falsehood is simply a statement which cannot have a proof,"8 and he adds that "[t]hese explanations are actually rather vague, since they make use of unexplained primitive notions such as "(constructive) proof" and "construction"." What also might call for further consideration is the notion of a possible proof.

Troelstra and van Dalen (1988, p. 9) present the BHK interpretation diferently:


<sup>8</sup> Note that if ¬A stands for A → ⊥, then a proof of ¬A is a construction p which transforms any proof c of A not into a contradiction, e.g. 0 = 1, but into a proof p(c) of ⊥.

which transforms any hypothetical proof of A into a proof of a contradiction.

According to this version of the BHK interpretation, a proof of (A ∧ B) is given if and only if (if) a proof of (B ∧ A) is given, and a proof of (A ∨ B) is given if a proof of (B ∨ A) is given. This explication identifies proofs of the distinct formulas (A ∧ B) and (B ∧ A) and proofs of the distinct formulas (A ∨ B) and (B ∨ A). Another version of the BHK interpretation as presented, for example, by Girard (1989, p. 5 f.) is different in that respect:

	- a. i = 0, and p is a proof of A, or
	- b. i = 1, and p is a proof of B.

As per this version of the BHK interpretation, a proof of (A ∧ B) differs from a proof of (B ∧ A) and a proof of (A ∨ B) is distinct from a proof of (B ∨ A). However, one and the same construction can be a proof of formulas (A ∨ B) and (A ∨ C). If p is a proof of A, then (0, p) is a proof of both (A ∨ B) and (A ∨ C), for any formulas B and C.
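The tagged-pair reading can be made concrete in a toy model of BHK constructions (our illustration, not Girard's notation): a proof of a disjunction is a pair whose first component records which disjunct is proved, so one construction may prove several disjunctions sharing that disjunct.

```python
# Toy BHK constructions: a proof of a disjunction is a pair (i, p),
# where i = 0 tags the left disjunct and i = 1 tags the right one.

def proves_disjunction(c, proves_left, proves_right):
    """Check whether construction c proves a disjunction, given
    checkers for proofs of its left and right disjuncts."""
    i, p = c
    return (i == 0 and proves_left(p)) or (i == 1 and proves_right(p))

# Stand-ins for proofs of atomic formulas A, B, C (hypothetical tokens):
is_proof_of_A = lambda x: x == "proof-of-A"
is_proof_of_B = lambda x: x == "proof-of-B"
is_proof_of_C = lambda x: x == "proof-of-C"

c = (0, "proof-of-A")  # one and the same construction ...
assert proves_disjunction(c, is_proof_of_A, is_proof_of_B)  # ... proves A ∨ B
assert proves_disjunction(c, is_proof_of_A, is_proof_of_C)  # ... and A ∨ C
```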

Troelstra and van Dalen (1988, p. 24) regard the BHK interpretation as a "natural semantics" for intuitionistic logic, and according to Girard (1989, p. 71) "Heyting's *semantics of proofs*" is even "[o]ne of the greatest ideas in logic." The Brouwer-Heyting-Kolmogorov interpretation can be and has been criticized for its treatment of intuitionistic negation, ¬; see, for example, Wansing (1993, p. 22 f.).<sup>10</sup> By clause (H4), a proof of the intuitionistically valid ¬(A ∧ ¬A) is a construction p such that if c is a proof of (A ∧ ¬A), then p(c) is a proof of ⊥, which does not exist. However, since (A ∧ ¬A) has no proof, any construction whatsoever proves ¬(A ∧ ¬A), but this conception of any construction converting a non-existent object into a non-existent object may be criticized as being non-constructive, which might have led Troelstra and van Dalen to consider hypothetical proofs.

López-Escobar (1972) suggested supplementing the BHK interpretation of positive intuitionistic logic with the primitive notion of *refutation* (or *disproof*) in order to give an interpretation of negation. As a result, one obtains a semantics for the four-valued paraconsistent constructive logic with strong negation introduced by David Nelson and his often unmentioned co-author, Ahmad Almukdad, and now known as the system N4; see Almukdad and Nelson (1984) and Odintsov (2008). López-Escobar

<sup>9</sup> Girard uses "⇒" instead of "→".

<sup>10</sup> It has been criticized for other reasons as well. Georg Kreisel (1962) suggested expanding the clause for implication by postulating that the construction or function referred to in that clause indeed satisfies the desired property. This *second clause* is highly controversial and has later been excluded from presentations of the BHK interpretation; cf. Dean and Kurokawa (2016). Wagner de Campos Sanz and Thomas Piecha (2014) argue that intuitionistic implicational logic is incomplete with respect to the BHK interpretation.

gives the following disproof interpretation of the intuitionistic connectives ∧, ∨, →, and the strong negation ∼ (notation adjusted):


Whilst the first three disproof clauses specify the form of (canonical) refutations of conjunctions, disjunctions, and implications, the clause for strong negation is different. It specifies that a construction c is a refutation of ∼A just in case the very same construction c is a proof of A. Moreover, López-Escobar requires that a construction c proves ∼A if c itself refutes A (and not if c is a proof of A → ⊥ and thus has a specific form).
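López-Escobar's clauses can be prototyped as a small recursive checker over formula trees. The representation below (tuples for formulas, tagged values for constructions, stipulated tokens for atomic proofs and disproofs, and a deliberately crude treatment of implication proofs) is our own illustrative choice, not López-Escobar's.

```python
# Formulas: ('atom', s), ('and', A, B), ('or', A, B), ('imp', A, B), ('neg', A).
ATOM_PROOFS = {'p': 'proof-of-p'}        # stipulated proofs of atoms
ATOM_DISPROOFS = {'q': 'disproof-of-q'}  # stipulated disproofs of atoms

def proves(c, f):
    tag = f[0]
    if tag == 'atom':
        return ATOM_PROOFS.get(f[1]) == c
    if tag == 'and':                 # a pair of proofs of both conjuncts
        return proves(c[0], f[1]) and proves(c[1], f[2])
    if tag == 'or':                  # a pair (i, p): p proves the i-th disjunct
        i, p = c
        return proves(p, f[1 + i])
    if tag == 'imp':                 # crude simplification: accept any function
        return callable(c)
    if tag == 'neg':                 # the construction itself refutes f[1]
        return refutes(c, f[1])

def refutes(c, f):
    tag = f[0]
    if tag == 'atom':
        return ATOM_DISPROOFS.get(f[1]) == c
    if tag == 'and':                 # a pair (i, d): d refutes the i-th conjunct
        i, d = c
        return refutes(d, f[1 + i])
    if tag == 'or':                  # a pair of refutations of both disjuncts
        return refutes(c[0], f[1]) and refutes(c[1], f[2])
    if tag == 'imp':                 # a proof of the antecedent and a refutation
        return proves(c[0], f[1]) and refutes(c[1], f[2])
    if tag == 'neg':                 # the very same construction proves f[1]
        return proves(c, f[1])

P = ('atom', 'p')
# A proof of p refutes ∼p and hence proves ∼∼p:
assert refutes('proof-of-p', ('neg', P))
assert proves('proof-of-p', ('neg', ('neg', P)))
# A refutation of the first conjunct refutes the whole conjunction:
assert refutes((0, 'disproof-of-q'), ('and', ('atom', 'q'), P))
```

Note how the clause for strong negation passes the construction through unchanged, which is exactly the feature exploited later when identical derivations are attached to distinct formulas.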

A fundamental assumption made by López-Escobar is that for no formula A there exists a construction that both proves and disproves A. If it is assumed that for no formula A there exist proofs of both A and ∼A, then *ex contradictione quodlibet* expressed as a formula, (A ∧ ∼A) → B, becomes provable. It is well known that the BHK interpretation in its various versions is sound for intuitionistic propositional logic in the following sense: If a formula A is provable in intuitionistic propositional logic, then there is a construction that proves A. Similarly, N4 is sound with respect to López-Escobar's proof/disproof interpretation: If a formula A is provable in N4, then there is a construction that proves A. Moreover, if a formula ∼A is provable in N4, then there is a construction that disproves A.<sup>12</sup>

*Remark 3.1* Usually the BHK interpretation is stated in terms of proofs from the empty set of assumptions. López-Escobar (1972, p. 367) considers sequents (derivability statements) {A₁, . . . , Aₙ} ⇒ B (notation adjusted). A sequent {A₁, . . . , Aₙ} ⇒ B is valid if there is a construction c such that c(c₁, . . . , cₙ) is a proof of B whenever c₁, . . . , cₙ are constructions proving A₁, . . . , Aₙ, respectively. A sequent ∅ ⇒ B is valid if there exists a construction that is a proof of B.

If derivations from sets or multisets of assumptions (hypotheses) are considered, the following weakening clause for derivations is sound:

if B is derivable from {A₁, . . . , Aₙ}, then B is derivable from {A₁, . . . , Aₙ, A} for any formula A.

A derivation of B from {A₁, . . . , Aₙ} is a construction c that transforms any proofs c₁, . . . , cₙ of A₁, . . . , Aₙ, respectively, into a proof c(c₁, . . . , cₙ) of B. From such a construction one can define the (n + 1)-place function c′ that for any formula

<sup>11</sup> The clauses for proofs are as one may expect, with the addition that a construction c proves ∼A if c refutes A.

<sup>12</sup> Note that the notion of refutation in the proof/disproof interpretation is to be distinguished from the notion of refutation in the theory of refutation calculi; cf. Goranko, Pulcini, and Skura (2020). The latter theory is concerned with the axiomatization of the non-theorems of given logics.

A maps any proof of A and any proofs c₁, . . . , cₙ of A₁, . . . , Aₙ, respectively, to c(c₁, . . . , cₙ). Therefore, one and the same construction can be a proof of the distinct formulas (A → (B → C)) and (D → (B → C)).
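The passage's construction of c′ from c can be sketched directly as a toy in which constructions are Python callables (names and representation ours): c′ simply discards the proof of the added assumption, which is why one and the same construction serves as a derivation under any choice of the extra formula.

```python
def weaken(c):
    """From an n-place construction c (a derivation of B from A1, ..., An),
    build the (n+1)-place construction c' that ignores a proof of an
    arbitrary extra formula A."""
    def c_prime(a, *cs):
        return c(*cs)  # the proof a of the extra formula plays no role
    return c_prime

# c: a 1-place construction, e.g. a derivation of B from {A1}
c = lambda c1: ('proof-of-B-from', c1)
c_prime = weaken(c)

# The very same construction c' is a derivation of B both from {A, A1}
# and from {D, A1}, whatever the extra formulas A and D may be:
assert c_prime('proof-of-A', 'proof-of-A1') == c('proof-of-A1')
assert c_prime('proof-of-D', 'proof-of-A1') == c('proof-of-A1')
```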

#### **4 Sequent calculi for N4**

The language of Nelson's constructive propositional logic N4 makes use of the connectives → (implication), ∧ (conjunction), ∨ (disjunction), and ∼ (strong negation). In what follows, lower-case letters p, q, r, . . . are used to denote propositional variables, capital letters A, B, C, . . . are used to denote formulas, and Greek capital letters Γ, Δ, . . . are used to represent finite (possibly empty) multisets of formulas. For a singleton multiset {A} we usually write just A, and A, Γ as well as Γ, A (Δ, Γ as well as Γ, Δ) designates the union of the multisets Γ and {A} (Δ and Γ). We use A ↔ B as an abbreviation of (A → B) ∧ (B → A) and A ⇔ B as an abbreviation of (A ↔ B) ∧ (∼A ↔ ∼B), and we refer to ⇔ as the connective of strong equivalence.
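The abbreviations ↔ and ⇔ can be rendered as formula constructors over a simple tree representation (a toy encoding of ours, not part of the paper's formal apparatus):

```python
# Formulas as tagged tuples.
def Imp(a, b): return ('imp', a, b)
def And(a, b): return ('and', a, b)
def Neg(a):    return ('neg', a)

def Iff(a, b):
    """A <-> B abbreviates (A -> B) /\ (B -> A)."""
    return And(Imp(a, b), Imp(b, a))

def StrongEq(a, b):
    """A <=> B (strong equivalence) abbreviates (A <-> B) /\ (~A <-> ~B)."""
    return And(Iff(a, b), Iff(Neg(a), Neg(b)))

p, q = ('atom', 'p'), ('atom', 'q')
assert Iff(p, q) == ('and', ('imp', p, q), ('imp', q, p))
assert StrongEq(p, q) == And(Iff(p, q), Iff(Neg(p), Neg(q)))
```

Strong equivalence is the finer of the two relations: it also requires the equivalence of the strongly negated formulas, which is why it will reappear below as the benchmark that the bilateralist notion of synonymy refines.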

A *sequent* is an expression of the form Γ ⇒ A. A sequent calculus is a non-empty set containing some axiomatic, initial sequents and rules of the form

$$\frac{s\_1 \quad \ldots \quad s\_n}{s}$$

where s and all sᵢ (1 ≤ i ≤ n) are sequents. Derivations in a sequent calculus are inductively defined as usual. Every instance of an initial sequent is a derivation, and applying a sequent rule to derivations of instances of its premise sequents results in a derivation. If there is a derivation of a sequent s in a sequent calculus S, we say that s is provable in S and denote this as S ⊢ s (or just as ⊢ s if the sequent calculus in question is clear). If D is a derivation of a sequent s we often write Dₛ, so that D and Dₛ stand for the same derivation.
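The inductive definition of derivations can be mirrored in a toy checker (our encoding; it covers only initial sequents of the kind used in GN4 below and, by way of example, GN4's double negation right rule):

```python
# Sequents are (antecedent-tuple, succedent); a derivation is a tree
# (rule-name, conclusion, premise-derivations).

def Neg(a): return ('neg', a)

def is_initial(seq):
    """Initial sequents: p => p and ~p => ~p for a propositional variable p."""
    ante, succ = seq
    return ante == (succ,) and (succ[0] == 'atom' or
                                (succ[0] == 'neg' and succ[1][0] == 'atom'))

def checks(d):
    """Verify that d is built from initial sequents by rule applications."""
    rule, concl, premises = d
    if rule == 'init':
        return premises == () and is_initial(concl)
    if not all(checks(p) for p in premises):
        return False
    if rule == 'nnr':  # from Γ => A infer Γ => ~~A
        (ante, succ), = [p[1] for p in premises]
        return concl == (ante, Neg(Neg(succ)))
    return False

p = ('atom', 'p')
d = ('nnr', ((p,), Neg(Neg(p))), (('init', ((p,), p), ()),))
assert checks(d)  # p => ~~p is derivable from p => p by the rule
```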

A rule of inference R is *admissible* in a sequent calculus S if for any instance

$$\frac{s\_1 \quad \ldots \quad s\_n}{s}$$

of R, if ⊢ sᵢ for all i (1 ≤ i ≤ n), then ⊢ s.

There is a kind of standard sequent calculus for N4 (cf. Kamide and Wansing, 2015; López-Escobar, 1972; Pearce, 1992; Wansing, 1993), which we shall refer to as GN4.

**Definition 4.1** The initial sequents of GN4 are of the form

$$p \Rightarrow p \qquad \sim p \Rightarrow \sim p$$

for any propositional variable p.

The structural inference rules of GN4 are of the form:


$$\frac{\Gamma \Rightarrow A \quad A, \Sigma \Rightarrow C}{\Gamma, \Sigma \Rightarrow C}\ (\text{cut}) \qquad \frac{\Gamma \Rightarrow C}{A, \Gamma \Rightarrow C}\ (\text{we}) \qquad \frac{A, A, \Gamma \Rightarrow C}{A, \Gamma \Rightarrow C}\ (\text{co})$$

The logical inference rules of GN4 are of the form:

$$\frac{\Gamma \Rightarrow A \quad B, \Delta \Rightarrow C}{A \to B, \Gamma, \Delta \Rightarrow C}\ (\to \text{l}) \qquad \frac{A, \Gamma \Rightarrow B}{\Gamma \Rightarrow A \to B}\ (\to \text{r})$$

$$\frac{A, B, \Gamma \Rightarrow C}{A \land B, \Gamma \Rightarrow C}\ (\land \text{l}) \qquad \frac{\Gamma \Rightarrow A \quad \Gamma \Rightarrow B}{\Gamma \Rightarrow A \land B}\ (\land \text{r})$$

$$\frac{A, \Gamma \Rightarrow C \quad B, \Gamma \Rightarrow C}{A \lor B, \Gamma \Rightarrow C}\ (\lor \text{l}) \qquad \frac{\Gamma \Rightarrow A}{\Gamma \Rightarrow A \lor B}\ (\lor \text{r1}) \qquad \frac{\Gamma \Rightarrow B}{\Gamma \Rightarrow A \lor B}\ (\lor \text{r2})$$

$$\frac{A, \Gamma \Rightarrow C}{\sim\sim A, \Gamma \Rightarrow C}\ (\sim\sim \text{l}) \qquad \frac{\Gamma \Rightarrow A}{\Gamma \Rightarrow \sim\sim A}\ (\sim\sim \text{r})$$

$$\frac{A, \sim B, \Gamma \Rightarrow C}{\sim(A \to B), \Gamma \Rightarrow C}\ (\sim\to \text{l}) \qquad \frac{\Gamma \Rightarrow A \quad \Gamma \Rightarrow \sim B}{\Gamma \Rightarrow \sim(A \to B)}\ (\sim\to \text{r})$$

$$\frac{\sim A, \Gamma \Rightarrow C \quad \sim B, \Gamma \Rightarrow C}{\sim(A \land B), \Gamma \Rightarrow C}\ (\sim\land \text{l}) \qquad \frac{\Gamma \Rightarrow \sim A}{\Gamma \Rightarrow \sim(A \land B)}\ (\sim\land \text{r1}) \qquad \frac{\Gamma \Rightarrow \sim B}{\Gamma \Rightarrow \sim(A \land B)}\ (\sim\land \text{r2})$$

$$\frac{\sim A, \sim B, \Gamma \Rightarrow C}{\sim(A \lor B), \Gamma \Rightarrow C}\ (\sim\lor \text{l}) \qquad \frac{\Gamma \Rightarrow \sim A \quad \Gamma \Rightarrow \sim B}{\Gamma \Rightarrow \sim(A \lor B)}\ (\sim\lor \text{r})$$

*Remark 4.2* A sequent calculus GLJ for positive intuitionistic logic is obtained from GN4 by deleting all initial sequents of the form ∼p ⇒ ∼p for any propositional variable p and deleting all the logical inference rules displaying ∼.

*Remark 4.3* In Kamide and Wansing (2012) and Kamide and Wansing (2015) a syntactical embedding of GN4 into GLJ is used to prove that the rule (cut) is admissible in cut-free GN4. As corollaries to that result, it is observed that N4 is decidable, that N4 satisfies the constructible falsity property, namely, that in GN4:

$$\text{if } \vdash \varnothing \Rightarrow \sim(A \land B), \text{ then } \vdash \varnothing \Rightarrow \sim A \text{ or } \vdash \varnothing \Rightarrow \sim B$$

and that N4 is paraconsistent with respect to strong negation, namely, that there exist formulas A and B such that it is not the case that ⊢ A, ∼A ⇒ B in GN4.

The cut-free version of the above sequent calculus for N4 has two features which may be seen as disadvantages: the subformula property fails, and the distinction between proofs and disproofs as primitive entities is hidden by being built into the distinction between left and right introduction rules for non-negated and for negated compound formulas. In Kamide and Wansing (2012) and Kamide and Wansing (2015), one can find a "subformula sequent calculus" that avoids the first problem, and a "dual sequent calculus" that highlights the distinction between proofs and disproofs.

For our purposes it is best to combine both systems into a sequent calculus, SN4, that avoids sequent rules for strongly negated compound formulas and makes use of two kinds of sequent arrows, one for proofs, ⇒<sup>+</sup>, and one for disproofs, ⇒<sup>−</sup>. Sequents of SN4 are of the form Γ : Δ ⇒<sup>+</sup> A or Γ : Δ ⇒<sup>−</sup> A, where A is a formula, and Γ and Δ are finite multisets of formulas. If a sequent Γ : Δ ⇒<sup>+</sup> A or Γ : Δ ⇒<sup>−</sup> A is provable in a sequent calculus S, then A is said to be provable, respectively disprovable, in S from Γ : Δ. If a formula A is (dis)provable in S from ∅ : ∅, then A is said to be (dis)provable in S.

The sequent system SN4 is merely a notational variant of the subformula sequent calculus for N4. Moreover, sequents of the form:

$$A\_1, \dots, A\_m : B\_1, \dots, B\_n \Rightarrow^+ C \quad \text{respectively} \quad A\_1, \dots, A\_m : B\_1, \dots, B\_n \Rightarrow^- C$$

in SN4 can be understood as sequents of the form:

$$\sim A\_1, \ldots, \sim A\_m, B\_1, \ldots, B\_n \Rightarrow C \quad \text{respectively} \quad \sim A\_1, \ldots, \sim A\_m, B\_1, \ldots, B\_n \Rightarrow \sim C$$

in GN4.
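The stated correspondence between SN4 and GN4 sequents is easy to mechanize (representation ours): formulas to the left of the colon are strongly negated, and a ⇒⁻ succedent becomes a strongly negated succedent.

```python
def neg(A):
    """Strong negation, as a formula constructor."""
    return ('neg', A)

def sn4_to_gn4(gamma, delta, sign, C):
    """Translate an SN4 sequent  Γ : Δ ⇒^sign C  (sign '+' or '-')
    into a GN4 sequent, returned as (antecedent-list, succedent)."""
    antecedent = [neg(A) for A in gamma] + list(delta)
    succedent = C if sign == '+' else neg(C)
    return antecedent, succedent

A, B, C = ('atom', 'a'), ('atom', 'b'), ('atom', 'c')
# A : B ⇒+ C corresponds to ∼A, B ⇒ C:
assert sn4_to_gn4([A], [B], '+', C) == ([neg(A), B], C)
# A : B ⇒− C corresponds to ∼A, B ⇒ ∼C:
assert sn4_to_gn4([A], [B], '-', C) == ([neg(A), B], neg(C))
```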

**Definition 4.4** Let ∗ ∈ {+, −}. The initial sequents of SN4 are of the form:

$$\varnothing : p \Rightarrow^{+} p \qquad p : \varnothing \Rightarrow^{-} p$$

for any propositional variable .

The interaction rules of SN4 are of the form:

$$\frac{\Gamma : \Delta \Rightarrow^{-} A}{\Gamma : \Delta \Rightarrow^{+} \sim A}\ (\sim\text{r}+) \qquad \frac{\Gamma : \Delta \Rightarrow^{+} A}{\Gamma : \Delta \Rightarrow^{-} \sim A}\ (\sim\text{r}-)$$

$$\frac{A, \Gamma : \Delta \Rightarrow^{+} C}{\Gamma : \sim A, \Delta \Rightarrow^{+} C}\ (\sim\text{l}+) \qquad \frac{\Gamma : A, \Delta \Rightarrow^{+} C}{\sim A, \Gamma : \Delta \Rightarrow^{+} C}\ (\sim\text{l}-)$$

$$\frac{\Gamma : \Delta \Rightarrow^{+} \sim A}{\Gamma : \Delta \Rightarrow^{-} A}\ (\sim\text{r}+\text{i}) \qquad \frac{\Gamma : \Delta \Rightarrow^{-} \sim A}{\Gamma : \Delta \Rightarrow^{+} A}\ (\sim\text{r}-\text{i})$$

$$\frac{\Gamma : \sim A, \Delta \Rightarrow^{+} C}{A, \Gamma : \Delta \Rightarrow^{+} C}\ (\sim\text{l}+\text{i}) \qquad \frac{\sim A, \Gamma : \Delta \Rightarrow^{+} C}{\Gamma : A, \Delta \Rightarrow^{+} C}\ (\sim\text{l}-\text{i})$$

The structural rules of SN4 are of the form:

$$\frac{\Gamma\_{1} : \Delta\_{1} \Rightarrow^{-} A \quad A, \Gamma\_{2} : \Delta\_{2} \Rightarrow^{\ast} C}{\Gamma\_{1}, \Gamma\_{2} : \Delta\_{1}, \Delta\_{2} \Rightarrow^{\ast} C}\ (\text{cut}-) \qquad \frac{\Gamma\_{1} : \Delta\_{1} \Rightarrow^{+} A \quad \Gamma\_{2} : A, \Delta\_{2} \Rightarrow^{\ast} C}{\Gamma\_{1}, \Gamma\_{2} : \Delta\_{1}, \Delta\_{2} \Rightarrow^{\ast} C}\ (\text{cut}+)$$

$$\frac{A, A, \Gamma : \Delta \Rightarrow^{\ast} C}{A, \Gamma : \Delta \Rightarrow^{\ast} C}\ (\text{co}-) \qquad \frac{\Gamma : A, A, \Delta \Rightarrow^{\ast} C}{\Gamma : A, \Delta \Rightarrow^{\ast} C}\ (\text{co}+)$$


$$\frac{\Gamma : \Delta \Rightarrow^{\ast} C}{A, \Gamma : \Delta \Rightarrow^{\ast} C}\ (\text{we}-) \qquad \frac{\Gamma : \Delta \Rightarrow^{\ast} C}{\Gamma : A, \Delta \Rightarrow^{\ast} C}\ (\text{we}+)$$

The positive inference rules of SN4 are of the form:13

$$\begin{array}{c} \frac{\Gamma_{1}:\Delta_{1}\Rightarrow^{+}A \quad \Gamma_{2}:B,\Delta_{2}\Rightarrow^{*}C}{\Gamma_{1},\Gamma_{2}:A\to B,\Delta_{1},\Delta_{2}\Rightarrow^{*}C} \ (\to\mathrm{l}+) \quad \frac{\Gamma:A,\Delta\Rightarrow^{+}B}{\Gamma:\Delta\Rightarrow^{+}A\to B} \ (\to\mathrm{r}+)\\\\ \frac{\Gamma:A,\Delta\Rightarrow^{*}C}{\Gamma:A\wedge B,\Delta\Rightarrow^{*}C} \ (\wedge\mathrm{l}1+) \quad \frac{\Gamma:B,\Delta\Rightarrow^{*}C}{\Gamma:A\wedge B,\Delta\Rightarrow^{*}C} \ (\wedge\mathrm{l}2+)\\\\ \frac{\Gamma:\Delta\Rightarrow^{+}A \quad \Gamma:\Delta\Rightarrow^{+}B}{\Gamma:\Delta\Rightarrow^{+}A\wedge B} \ (\wedge\mathrm{r}+) \quad \frac{\Gamma:A,\Delta\Rightarrow^{*}C \quad \Gamma:B,\Delta\Rightarrow^{*}C}{\Gamma:A\vee B,\Delta\Rightarrow^{*}C} \ (\vee\mathrm{l}+)\\\\ \frac{\Gamma:\Delta\Rightarrow^{+}A}{\Gamma:\Delta\Rightarrow^{+}A\vee B} \ (\vee\mathrm{r}1+) \quad \frac{\Gamma:\Delta\Rightarrow^{+}B}{\Gamma:\Delta\Rightarrow^{+}A\vee B} \ (\vee\mathrm{r}2+). \end{array}$$

The negative inference rules of SN4 are of the form:

$$\begin{array}{c} \frac{B,\Gamma:A,\Delta\Rightarrow^{*}C}{(A\to B),\Gamma:\Delta\Rightarrow^{*}C} \ (\to\mathrm{l}-) \quad \frac{\Gamma_{1}:\Delta_{1}\Rightarrow^{+}A \quad \Gamma_{2}:\Delta_{2}\Rightarrow^{-}B}{\Gamma_{1},\Gamma_{2}:\Delta_{1},\Delta_{2}\Rightarrow^{-}A\to B} \ (\to\mathrm{r}-)\\\\ \frac{A,\Gamma:\Delta\Rightarrow^{*}C \quad B,\Gamma:\Delta\Rightarrow^{*}C}{(A\wedge B),\Gamma:\Delta\Rightarrow^{*}C} \ (\wedge\mathrm{l}-) \quad \frac{\Gamma:\Delta\Rightarrow^{-}A}{\Gamma:\Delta\Rightarrow^{-}A\wedge B} \ (\wedge\mathrm{r}1-) \quad \frac{\Gamma:\Delta\Rightarrow^{-}B}{\Gamma:\Delta\Rightarrow^{-}A\wedge B} \ (\wedge\mathrm{r}2-)\\\\ \frac{A,\Gamma:\Delta\Rightarrow^{*}C}{(A\vee B),\Gamma:\Delta\Rightarrow^{*}C} \ (\vee\mathrm{l}1-) \quad \frac{B,\Gamma:\Delta\Rightarrow^{*}C}{(A\vee B),\Gamma:\Delta\Rightarrow^{*}C} \ (\vee\mathrm{l}2-) \quad \frac{\Gamma:\Delta\Rightarrow^{-}A \quad \Gamma:\Delta\Rightarrow^{-}B}{\Gamma:\Delta\Rightarrow^{-}A\vee B} \ (\vee\mathrm{r}-). \end{array}$$

The single-premise interaction rules are sequent rules whose application is taken to have no effect on the identity of derivations.

**Proposition 4.5** *In* SN4*, for any formula* A*,* ⊢ ∅ : A ⇒<sup>+</sup> A *and* ⊢ A : ∅ ⇒<sup>−</sup> A*.*

Derivations ending in a sequent of the form Γ : Δ ⇒<sup>+</sup> A are said to be proofs, and derivations ending in a sequent of the form Γ : Δ ⇒<sup>−</sup> A are said to be disproofs. The following theorem is proved by simultaneous induction for the cases (a) and (b), i.e., for the first claim by simultaneous induction on the height of proofs and disproofs in SN4, and for the second claim by induction on the height of proofs in GN4−{(cut)}; it justifies the reading of sequents in SN4 mentioned above. We let ∼Γ stand for the multiset of all formulas ∼A with A in Γ if Γ is not empty; otherwise ∼Γ is the empty multiset.

<sup>13</sup> The rules (∧l1+) and (∧l2+) do not follow the pattern of the single left introduction rule of ∧ in GN4 simply for uniformity with the presentation in Kamide and Wansing (2015). Similarly for the rules (∨l1− ) and (∨l2− ) and the left introduction rule for negated disjunctions in GN4.

**Theorem 4.6** *Let* Γ *and* Δ *be finite multisets of formulas, and let* A *be a formula.*

*1. (a) If* SN4 ⊢ Γ : Δ ⇒<sup>+</sup> A*, then* GN4 ⊢ ∼Γ, Δ ⇒ A*, and (b) if* SN4 ⊢ Γ : Δ ⇒<sup>−</sup> A*, then* GN4 ⊢ ∼Γ, Δ ⇒ ∼A*.*
*2. (a) If* GN4−{(cut)} ⊢ ∼Γ, Δ ⇒ A*, then* SN4−{(cut−), (cut+)} ⊢ Γ : Δ ⇒<sup>+</sup> A*, and (b) if* GN4−{(cut)} ⊢ ∼Γ, Δ ⇒ ∼A*, then* SN4−{(cut−), (cut+)} ⊢ Γ : Δ ⇒<sup>−</sup> A*.*

*Proof* To illustrate the need for a simultaneous induction, we consider one case for item 2.(b). Suppose the last step in the derivation is an application of (∼ → r), deriving a formula of the form ∼(A → B):

$$
\frac{\Gamma \Rightarrow A \quad \Gamma \Rightarrow \sim B}{\Gamma \Rightarrow \sim (A \to B) .}
$$

Then, by the induction hypotheses for both (a) and (b) we obtain the following derivation, where the dots indicate applications of (co+):

$$\begin{array}{c} \dfrac{\emptyset : \Gamma \Rightarrow^{+} A \quad \emptyset : \Gamma \Rightarrow^{-} B}{\emptyset : \Gamma, \Gamma \Rightarrow^{-} (A \to B)}\\ \vdots\\ \emptyset : \Gamma \Rightarrow^{-} (A \to B). \end{array} \qquad \Box$$
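Theorem 4.6's reading of SN4 sequents — Γ collecting formulas treated as refuted, Δ formulas treated as accepted — amounts to a simple translation into GN4 sequents. The following Python sketch (with illustrative names; it is not part of the paper's formal apparatus) implements just this translation:

```python
# Illustrative encoding of the translation underlying Theorem 4.6.
# Formulas are strings; strong negation is modelled by prefixing "~".

def neg(a: str) -> str:
    """Strong negation of a formula."""
    return "~" + a

def to_gn4(gamma, delta, polarity, a):
    """Map an SN4 sequent Gamma : Delta =>^polarity a to a GN4 sequent.

    The antecedent of the GN4 sequent is the multiset ~Gamma together
    with Delta; the succedent is a for polarity "+" and ~a for "-".
    """
    antecedent = [neg(g) for g in gamma] + list(delta)
    succedent = a if polarity == "+" else neg(a)
    return antecedent, succedent

# The SN4 sequent p : q =>^- r corresponds to the GN4 sequent ~p, q => ~r.
print(to_gn4(["p"], ["q"], "-", "r"))  # → (['~p', 'q'], '~r')
```

The split of a GN4 antecedent into ∼Γ and Δ is, of course, not unique, which is why Theorem 4.6(2) quantifies over all such splits.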

**Theorem 4.7** *The rules* (cut+) *and* (cut−) *are admissible in cut-free* SN4 (*i.e., in* SN4 −{(cut−), (cut+)})*.*

*Proof* Suppose that a sequent Γ : Δ ⇒<sup>+</sup> A, respectively Γ : Δ ⇒<sup>−</sup> A, is provable in SN4. Then ∼Γ, Δ ⇒ A, respectively ∼Γ, Δ ⇒ ∼A, is provable in GN4 by Theorem 4.6(1), and hence the sequent ∼Γ, Δ ⇒ A, respectively ∼Γ, Δ ⇒ ∼A, is provable in cut-free GN4 by the cut-elimination theorem for GN4. Therefore Γ : Δ ⇒<sup>+</sup> A, respectively Γ : Δ ⇒<sup>−</sup> A, is provable in cut-free SN4 by Theorem 4.6(2). □

**Theorem 4.8** *Let* Γ *and* Δ *be finite multisets of formulas, let* A *be a formula, and let* SN4<sup>−</sup> *stand for* SN4−{(cut−), (cut+)*,* (∼r+i)*,* (∼r−i)*,* (∼l+i)*,* (∼l−i)}*.*

*(a) If* GN4−{(cut)} ⊢ ∼Γ, Δ ⇒ A*, then* SN4<sup>−</sup> ⊢ Γ : Δ ⇒<sup>+</sup> A*, and (b) if* GN4−{(cut)} ⊢ ∼Γ, Δ ⇒ ∼A*, then* SN4<sup>−</sup> ⊢ Γ : Δ ⇒<sup>−</sup> A*.*

*Proof* By simultaneous induction on the height of derivations in GN4−{(cut)} for (a) and (b). □

**Theorem 4.9** *The subformula spoiling rules* (∼r+i)*,* (∼r−i)*,* (∼l+i)*, and* (∼l−i) *are admissible in* SN4<sup>−</sup>*.*

*Proof* Suppose that a sequent Γ : Δ ⇒<sup>+</sup> A, respectively Γ : Δ ⇒<sup>−</sup> A, is provable in cut-free SN4. Then ∼Γ, Δ ⇒ A, respectively ∼Γ, Δ ⇒ ∼A, is provable in GN4 by Theorem 4.6(1), and hence the sequent ∼Γ, Δ ⇒ A, respectively ∼Γ, Δ ⇒ ∼A, is provable in cut-free GN4 by the cut-elimination theorem for GN4. Therefore Γ : Δ ⇒<sup>+</sup> A, respectively Γ : Δ ⇒<sup>−</sup> A, is provable in SN4<sup>−</sup> by Theorem 4.8. □

**Corollary 4.10** SN4 *has the subformula property, i.e., if a sequent* s *is provable (disprovable) in* SN4*, then there is a proof (disproof) of* s *such that all formulas appearing in that proof (disproof) are subformulas of some formula in* s*.*

#### **5 Synonymy in cut-free** SN4

In view of Theorem 4.7, we focus on cut-free SN4 and define a notion of identity between derivations in cut-free SN4. We want to have a notion of identity of derivations such that, for example, there is a proof D of (∼A ∧ ∼B) from ∼(A ∨ B) and a proof D′ of ∼(A ∨ B) from (∼A ∧ ∼B) such that D and D′ are identical, because according to López-Escobar's proof/disproof interpretation, a proof of (∼A ∧ ∼B) is an ordered pair consisting of a proof of ∼A as the first component and a proof of ∼B as the second component. This pair is a pair consisting of a disproof of A and a disproof of B, which then is a disproof of (A ∨ B), and therefore it is also a proof of ∼(A ∨ B).

If D and D′ are derivations in cut-free SN4, we shall write D ≡ D′ to express that D and D′ are syntactically (that is, as strings of symbols) identical.

**Definition 5.1** The relation ≈ of inherited identity (in-identity) between derivations D<sub>1</sub> and D<sub>2</sub> in cut-free SN4 is defined inductively. It is the smallest binary relation on the set of derivations in cut-free SN4 such that:

1. D<sub>1</sub> ≈ D<sub>2</sub> if D<sub>1</sub> ≡ D<sub>2</sub>.
2. D<sub>1</sub> ≈ D<sub>2</sub> if either D<sub>1</sub> ≈ D and D<sub>2</sub> ≡ D′ or D<sub>2</sub> ≈ D and D<sub>1</sub> ≡ D′, where D′ is obtained from D by an application of an (instance of an) interaction rule.

$$3.\ \mathcal{D}_1 \approx \mathcal{D}_2 \text{ if } \mathcal{D}_1 \equiv \frac{\mathcal{D}_1^1 \dots \mathcal{D}_n^1}{s_1},\ \mathcal{D}_2 \equiv \frac{\mathcal{D}_1^2 \dots \mathcal{D}_n^2}{s_2},\ \text{and } \mathcal{D}_i^1 \approx \mathcal{D}_i^2 \ (1 \le i \le n \le 2).$$
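Clauses (1)–(3) suggest an operational reading of in-identity: discard root applications of the single-premise interaction rules and then compare derivations component-wise. The following Python sketch is a rough model of that reading; the tree encoding and the rule names are illustrative assumptions, not part of the paper's formal apparatus:

```python
# Derivation trees as (rule, end_sequent, children); leaves have no children.
# INTERACTION collects the single-premise interaction rules of SN4, whose
# application has no effect on the identity of derivations (clause 2).
INTERACTION = {"~r+", "~r-", "~l+", "~l-", "~r+i", "~r-i", "~l+i", "~l-i"}

def strip_root(d):
    """Remove root applications of interaction rules."""
    rule, seq, children = d
    while rule in INTERACTION and len(children) == 1:
        rule, seq, children = children[0]
    return rule, seq, children

def inid(d1, d2):
    """In-identity: clause 1 (syntactic identity), clause 2 (interaction
    steps at the root are invisible), clause 3 (component-wise comparison;
    the end sequents themselves may differ)."""
    if d1 == d2:                      # clause 1
        return True
    r1, s1, c1 = strip_root(d1)       # clause 2
    r2, s2, c2 = strip_root(d2)
    if not c1 or not c2:              # axioms: only clause 1 applies
        return (r1, s1, c1) == (r2, s2, c2)
    if len(c1) != len(c2):
        return False
    return all(inid(a, b) for a, b in zip(c1, c2))  # clause 3

# Example 5.5 in miniature: a derivation of G:D =>+ ~A is in-identical
# to its extension by an interaction rule ending in G:D =>- A.
d = ("ax", "G:D =>+ ~A", ())
ext = ("~r+i", "G:D =>- A", (d,))
print(inid(d, ext))  # → True
```

Note that, as in Remark 5.6 below, such a comparison does not yield a congruence: in-identical subtrees cannot in general be exchanged inside a larger derivation.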

**Proposition 5.2** *The relation* ≈ *is an equivalence relation.*

*Proof* Reflexivity of ≈ follows by Definition 5.1(1). Symmetry: If D<sub>1</sub> ≈ D<sub>2</sub> by Definition 5.1(1), then D<sub>2</sub> ≈ D<sub>1</sub> follows by Definition 5.1(1). If D<sub>1</sub> ≈ D<sub>2</sub> by Definition 5.1(2), then D<sub>2</sub> ≈ D<sub>1</sub> follows by Definition 5.1(2), since by the induction hypothesis, either D ≈ D<sub>1</sub> or D ≈ D<sub>2</sub>. If D<sub>1</sub> ≈ D<sub>2</sub> by Definition 5.1(3), then D<sub>2</sub> ≈ D<sub>1</sub> follows by Definition 5.1(3), because by the induction hypothesis, $\mathcal{D}_i^1 \approx \mathcal{D}_i^2$ implies $\mathcal{D}_i^2 \approx \mathcal{D}_i^1$. Transitivity: There are nine cases, viz.:

$$\begin{array}{ll} [1]\ \mathcal{D}_{1} \stackrel{5.1(1)}{\approx} \mathcal{D}_{2} \stackrel{5.1(1)}{\approx} \mathcal{D}_{3} & [6]\ \mathcal{D}_{1} \stackrel{5.1(2)}{\approx} \mathcal{D}_{2} \stackrel{5.1(3)}{\approx} \mathcal{D}_{3}\\ [2]\ \mathcal{D}_{1} \stackrel{5.1(1)}{\approx} \mathcal{D}_{2} \stackrel{5.1(2)}{\approx} \mathcal{D}_{3} & [7]\ \mathcal{D}_{1} \stackrel{5.1(3)}{\approx} \mathcal{D}_{2} \stackrel{5.1(1)}{\approx} \mathcal{D}_{3}\\ [3]\ \mathcal{D}_{1} \stackrel{5.1(1)}{\approx} \mathcal{D}_{2} \stackrel{5.1(3)}{\approx} \mathcal{D}_{3} & [8]\ \mathcal{D}_{1} \stackrel{5.1(3)}{\approx} \mathcal{D}_{2} \stackrel{5.1(2)}{\approx} \mathcal{D}_{3}\\ [4]\ \mathcal{D}_{1} \stackrel{5.1(2)}{\approx} \mathcal{D}_{2} \stackrel{5.1(1)}{\approx} \mathcal{D}_{3} & [9]\ \mathcal{D}_{1} \stackrel{5.1(3)}{\approx} \mathcal{D}_{2} \stackrel{5.1(3)}{\approx} \mathcal{D}_{3}\\ [5]\ \mathcal{D}_{1} \stackrel{5.1(2)}{\approx} \mathcal{D}_{2} \stackrel{5.1(2)}{\approx} \mathcal{D}_{3} \end{array}$$

In cases [1], [2], [3], [4], and [7], D<sup>1</sup> ≈ D<sup>3</sup> holds by clause 5.1(1). In cases [5] and [9], D<sup>1</sup> ≈ D<sup>3</sup> holds by the induction hypothesis and clause 5.1(2), respectively 5.1(3). Case [6]: D<sup>2</sup> and D<sup>3</sup> have the form

$$\frac{\mathcal{D}\_1^2 \dots \mathcal{D}\_n^2}{s\_2} \quad \text{respectively} \quad \frac{\mathcal{D}\_1^3 \dots \mathcal{D}\_n^3}{s\_3}$$

Since D<sub>1</sub> ≈ D<sub>2</sub> by 5.1(2), the derivations D<sub>1</sub>, D<sub>2</sub>, and D<sub>3</sub> have the form

$$\frac{\dfrac{\mathcal{D}'}{s'}}{s_1} \qquad \frac{\dfrac{\mathcal{D}''}{s''}}{s_2} \qquad \frac{\dfrac{\mathcal{D}'''}{s'''}}{s_3}$$

respectively, where $\frac{\mathcal{D}'}{s'} \approx \frac{\mathcal{D}''}{s''}$ and $\frac{\mathcal{D}''}{s''} \approx \frac{\mathcal{D}'''}{s'''}$. By the induction hypothesis and clause (3), D<sub>1</sub> ≈ D<sub>3</sub>. Case [8] is analogous. □

*Remark 5.3* We are working with a variant of the standard sequent calculus for N4 with (co+), (co-), (we+), and (we-) as primitive rules, and not with a variant of a G3-style version for which not only cut but also weakening and contraction are admissible.<sup>14</sup> Different notions of in-identity can be obtained by excluding some structural rules from being applied in the last derivation step of D<sub>1</sub> and D<sub>2</sub> in the third clause of Definition 5.1. Moreover, the third clause of Definition 5.1 allows one to identify derivations such as

$$\frac{\begin{array}{c}\mathcal{D}\\ \Gamma : \Delta \Rightarrow^{+} A\end{array}}{\Gamma : \Delta \Rightarrow^{+} (A \vee B)} \quad \text{and} \quad \frac{\begin{array}{c}\mathcal{D}\\ \Gamma : \Delta \Rightarrow^{+} A\end{array}}{\Gamma : \Delta \Rightarrow^{+} (B \vee A)}$$

which is in accordance with the BHK interpretation allowing for identical proofs of (A ∨ B) and (B ∨ A).

*Remark 5.4* Note that not any two cut-free derivations D<sub>1</sub> and D<sub>2</sub> of a formula are in-identical. There are, e.g., syntactically distinct cut-free derivations of the sequent ∅ : ∅ ⇒<sup>+</sup> (p ∧ q) → (p ∨ q) that are not in-identical.

*Example 5.5* Let D<sub>1</sub> be a derivation of Γ : Δ ⇒<sup>+</sup> ∼A and D<sub>2</sub> a derivation of Γ : Δ ⇒<sup>+</sup> ∼B in cut-free SN4. Then

$$\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{+} \sim A\end{array} \approx \frac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{+} \sim A\end{array}}{\Gamma : \Delta \Rightarrow^{-} A} \quad \text{and} \quad \begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{+} \sim B\end{array} \approx \frac{\begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{+} \sim B\end{array}}{\Gamma : \Delta \Rightarrow^{-} B}$$

by Definition 5.1(2). Hence, by Definition 5.1(3):

$$\frac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{+} \sim A\end{array} \quad \begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{+} \sim B\end{array}}{\Gamma : \Delta \Rightarrow^{+} (\sim A \wedge \sim B)} \approx \frac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{+} \sim A\end{array}}{\Gamma : \Delta \Rightarrow^{-} A} \quad \dfrac{\begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{+} \sim B\end{array}}{\Gamma : \Delta \Rightarrow^{-} B}}{\Gamma : \Delta \Rightarrow^{-} (A \vee B)}.$$

<sup>14</sup> A contraction-free Hudelmaier-Dyckhoff-style sequent calculus for N4 can be found in Kamide and Wansing (2012) and Kamide and Wansing (2015).

By Definition 5.1(2):

$$\frac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{+} \sim A\end{array}}{\Gamma : \Delta \Rightarrow^{-} A} \quad \dfrac{\begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{+} \sim B\end{array}}{\Gamma : \Delta \Rightarrow^{-} B}}{\Gamma : \Delta \Rightarrow^{-} (A \vee B)} \approx \frac{\dfrac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{+} \sim A\end{array}}{\Gamma : \Delta \Rightarrow^{-} A} \quad \dfrac{\begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{+} \sim B\end{array}}{\Gamma : \Delta \Rightarrow^{-} B}}{\Gamma : \Delta \Rightarrow^{-} (A \vee B)}}{\Gamma : \Delta \Rightarrow^{+} \sim(A \vee B)}.$$

By transitivity of ≈:

$$\frac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{+} \sim A\end{array} \quad \begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{+} \sim B\end{array}}{\Gamma : \Delta \Rightarrow^{+} (\sim A \wedge \sim B)} \approx \frac{\dfrac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{+} \sim A\end{array}}{\Gamma : \Delta \Rightarrow^{-} A} \quad \dfrac{\begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{+} \sim B\end{array}}{\Gamma : \Delta \Rightarrow^{-} B}}{\Gamma : \Delta \Rightarrow^{-} (A \vee B)}}{\Gamma : \Delta \Rightarrow^{+} \sim(A \vee B)}.$$

*Remark 5.6* Note that in-identity is not a congruence relation. If D<sup>1</sup> ≈ D2, then, in general, substituting D<sup>1</sup> for D<sup>2</sup> in a derivation tree does not result in a derivation. In the previous example, we have

$$\frac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{+} \sim A\end{array} \quad \begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{+} \sim B\end{array}}{\Gamma : \Delta \Rightarrow^{+} (\sim A \wedge \sim B)} \approx \frac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{+} \sim A\end{array}}{\Gamma : \Delta \Rightarrow^{-} A} \quad \dfrac{\begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{+} \sim B\end{array}}{\Gamma : \Delta \Rightarrow^{-} B}}{\Gamma : \Delta \Rightarrow^{-} (A \vee B)}$$

but whereas

$$\frac{\dfrac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{+} \sim A\end{array}}{\Gamma : \Delta \Rightarrow^{-} A} \quad \dfrac{\begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{+} \sim B\end{array}}{\Gamma : \Delta \Rightarrow^{-} B}}{\Gamma : \Delta \Rightarrow^{-} (A \vee B)}}{\Gamma : \Delta \Rightarrow^{+} \sim(A \vee B)}$$

is a derivation in cut-free SN4,

$$\frac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{+} \sim A\end{array} \quad \begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{+} \sim B\end{array}}{\Gamma : \Delta \Rightarrow^{+} (\sim A \wedge \sim B)}}{\Gamma : \Delta \Rightarrow^{+} \sim(A \vee B)}$$

is not.

*Example 5.7* This example is very similar to the previous one, but it may nevertheless be instructive. Let now D<sub>1</sub> be a derivation of Γ : Δ ⇒<sup>−</sup> A and D<sub>2</sub> a derivation of Γ : Δ ⇒<sup>−</sup> B in cut-free SN4. Then

$$\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{-} A\end{array} \approx \frac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{-} A\end{array}}{\Gamma : \Delta \Rightarrow^{+} \sim A} \quad \text{and} \quad \begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{-} B\end{array} \approx \frac{\begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{-} B\end{array}}{\Gamma : \Delta \Rightarrow^{+} \sim B}$$

by Definition 5.1(2). Also by Definition 5.1(2):

$$\frac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{-} A\end{array} \quad \begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{-} B\end{array}}{\Gamma : \Delta \Rightarrow^{-} (A \vee B)}}{\Gamma : \Delta \Rightarrow^{+} \sim(A \vee B)} \approx \frac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{-} A\end{array} \quad \begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{-} B\end{array}}{\Gamma : \Delta \Rightarrow^{-} (A \vee B)}.$$

By Definition 5.1(3):

$$\frac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{-} A\end{array} \quad \begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{-} B\end{array}}{\Gamma : \Delta \Rightarrow^{-} (A \vee B)} \approx \frac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{-} A\end{array}}{\Gamma : \Delta \Rightarrow^{+} \sim A} \quad \dfrac{\begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{-} B\end{array}}{\Gamma : \Delta \Rightarrow^{+} \sim B}}{\Gamma : \Delta \Rightarrow^{+} (\sim A \wedge \sim B)}.$$

By transitivity of ≈:

$$\frac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{-} A\end{array} \quad \begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{-} B\end{array}}{\Gamma : \Delta \Rightarrow^{-} (A \vee B)}}{\Gamma : \Delta \Rightarrow^{+} \sim(A \vee B)} \approx \frac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{-} A\end{array}}{\Gamma : \Delta \Rightarrow^{+} \sim A} \quad \dfrac{\begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{-} B\end{array}}{\Gamma : \Delta \Rightarrow^{+} \sim B}}{\Gamma : \Delta \Rightarrow^{+} (\sim A \wedge \sim B)}.$$

**Definition 5.8** Two formulas A and B are said to be synonymous with respect to cut-free SN4 (A ⇄ B in cut-free SN4) if there are derivations D<sub>1</sub> of ∅ : A ⇒<sup>+</sup> B and D<sub>2</sub> of ∅ : B ⇒<sup>+</sup> A such that D<sub>1</sub> ≈ D<sub>2</sub>, and there are derivations D<sub>3</sub> of A : ∅ ⇒<sup>−</sup> B and D<sub>4</sub> of B : ∅ ⇒<sup>−</sup> A such that D<sub>3</sub> ≈ D<sub>4</sub>.


**Proposition 5.9** *The following pairs of formulas are synonymous with respect to cut-free* SN4*:*

*1. p and* ∼∼p*,*
*2.* (p ∧ q) *and* ∼(∼p ∨ ∼q)*,*
*3.* (p ∨ q) *and* ∼(∼p ∧ ∼q)*.*

*Proof* 1: We have

$$\frac{\dfrac{\emptyset : p \Rightarrow^{+} p}{\emptyset : p \Rightarrow^{-} \sim p}}{\emptyset : p \Rightarrow^{+} \sim\sim p} \qquad \frac{\dfrac{\emptyset : p \Rightarrow^{+} p}{\sim p : \emptyset \Rightarrow^{+} p}}{\emptyset : \sim\sim p \Rightarrow^{+} p} \qquad \frac{\dfrac{p : \emptyset \Rightarrow^{-} p}{p : \emptyset \Rightarrow^{+} \sim p}}{p : \emptyset \Rightarrow^{-} \sim\sim p} \qquad \frac{\dfrac{p : \emptyset \Rightarrow^{-} p}{\emptyset : \sim p \Rightarrow^{-} p}}{\sim\sim p : \emptyset \Rightarrow^{-} p}$$

The first and the second of these derivations are in-identical by clauses (1) and (2) of Definition 5.1, and so are the third and the fourth.

2., positive condition: By clauses (1) and (2) of Definition 5.1, the derivations ∅ : p ⇒<sup>+</sup> p and

$$\frac{\emptyset : p \Rightarrow^{+} p}{\sim p : \emptyset \Rightarrow^{+} p}$$

are in-identical, and so are ∅ : q ⇒<sup>+</sup> q and

$$\frac{\emptyset : q \Rightarrow^{+} q}{\sim q : \emptyset \Rightarrow^{+} q}.$$

Let D<sub>1</sub> and D<sub>2</sub> be the derivations

$$\frac{\dfrac{\emptyset : p \Rightarrow^{+} p}{\emptyset : (p \wedge q) \Rightarrow^{+} p}}{\emptyset : (p \wedge q) \Rightarrow^{-} \sim p} \qquad \text{respectively} \qquad \frac{\dfrac{\emptyset : q \Rightarrow^{+} q}{\emptyset : (p \wedge q) \Rightarrow^{+} q}}{\emptyset : (p \wedge q) \Rightarrow^{-} \sim q}
$$

and let D<sup>3</sup> and D<sup>4</sup> be the derivations

$$\frac{\dfrac{\dfrac{\emptyset : p \Rightarrow^{+} p}{\sim p : \emptyset \Rightarrow^{+} p}}{(\sim p \vee \sim q) : \emptyset \Rightarrow^{+} p}}{\emptyset : \sim(\sim p \vee \sim q) \Rightarrow^{+} p} \qquad \text{respectively} \qquad \frac{\dfrac{\dfrac{\emptyset : q \Rightarrow^{+} q}{\sim q : \emptyset \Rightarrow^{+} q}}{(\sim p \vee \sim q) : \emptyset \Rightarrow^{+} q}}{\emptyset : \sim(\sim p \vee \sim q) \Rightarrow^{+} q}.$$

By clauses (3) and (2) of Definition 5.1, D<sub>1</sub> ≈ D<sub>3</sub> and D<sub>2</sub> ≈ D<sub>4</sub>. Let D<sub>5</sub> and D<sub>6</sub> be the derivations

$$\frac{\mathcal{D}_1 \quad \mathcal{D}_2}{\emptyset : (p \wedge q) \Rightarrow^{-} (\sim p \vee \sim q)} \qquad \text{respectively} \qquad \frac{\mathcal{D}_3 \quad \mathcal{D}_4}{\emptyset : \sim(\sim p \vee \sim q) \Rightarrow^{+} (p \wedge q)}.$$

By clause (3) of Definition 5.1, D<sub>5</sub> ≈ D<sub>6</sub>, and by clause (2) of Definition 5.1, D<sub>7</sub> ≈ D<sub>6</sub>, where

$$\mathcal{D}_7 \equiv \frac{\mathcal{D}_5}{\emptyset : (p \wedge q) \Rightarrow^{+} \sim(\sim p \vee \sim q)}.$$

2., negative condition: By clauses (1) and (2) of Definition 5.1,

$$\frac{p : \emptyset \Rightarrow^{-} p}{p : \emptyset \Rightarrow^{+} \sim p} \approx \frac{p : \emptyset \Rightarrow^{-} p}{\emptyset : \sim p \Rightarrow^{-} p} \quad \text{and} \quad \frac{q : \emptyset \Rightarrow^{-} q}{q : \emptyset \Rightarrow^{+} \sim q} \approx \frac{q : \emptyset \Rightarrow^{-} q}{\emptyset : \sim q \Rightarrow^{-} q}.$$

Call these derivations D<sub>1</sub>, D<sub>2</sub>, D<sub>3</sub>, and D<sub>4</sub>, respectively. By clause (3) of Definition 5.1, therefore

$$\frac{\mathcal{D}_1}{p : \emptyset \Rightarrow^{+} (\sim p \vee \sim q)} \approx \frac{\mathcal{D}_2}{\emptyset : \sim p \Rightarrow^{-} (p \wedge q)} \quad \text{and} \quad \frac{\mathcal{D}_3}{q : \emptyset \Rightarrow^{+} (\sim p \vee \sim q)} \approx \frac{\mathcal{D}_4}{\emptyset : \sim q \Rightarrow^{-} (p \wedge q)}.$$

Let us call the latter derivations D11, D21, D31, and D41, respectively. Let

$$\mathcal{D}_6 \equiv \frac{\mathcal{D}_{11} \quad \mathcal{D}_{31}}{(p \wedge q) : \emptyset \Rightarrow^{+} (\sim p \vee \sim q)} \quad \text{and} \quad \mathcal{D}_7 \equiv \frac{\mathcal{D}_{21} \quad \mathcal{D}_{41}}{\emptyset : (\sim p \vee \sim q) \Rightarrow^{-} (p \wedge q)}.$$

By clause (3) of Definition 5.1, D<sub>6</sub> ≈ D<sub>7</sub>, and by clause (2) of Definition 5.1,

$$\frac{\mathcal{D}\_6}{(p \land q) : \mathcal{Q} \Rightarrow^- \sim (\sim p \lor \sim q)} \approx \frac{\mathcal{D}\_7}{\sim (\sim p \lor \sim q) : \mathcal{Q} \Rightarrow^- (p \land q)}$$

3: This case is similar to the previous case. □

**Definition 5.10** Two formulas A and B are strongly equivalent in cut-free SN4 if ⊢ ∅ : ∅ ⇒<sup>+</sup> (A ⇔ B).

It can easily be seen that two formulas A and B are strongly equivalent in SN4, and hence also in cut-free SN4, just in case ⊢ ∅ : A ⇒<sup>+</sup> B, ⊢ ∅ : B ⇒<sup>+</sup> A, ⊢ A : ∅ ⇒<sup>−</sup> B, and ⊢ B : ∅ ⇒<sup>−</sup> A.

**Proposition 5.11** *If* A ⇄ B *in cut-free* SN4*, then* A *and* B *are strongly equivalent in cut-free* SN4*.*

*Proof* Assume that A and B are synonymous. By Definition 5.8, ⊢ ∅ : A ⇒<sup>+</sup> B, ⊢ ∅ : B ⇒<sup>+</sup> A, ⊢ A : ∅ ⇒<sup>−</sup> B, and ⊢ B : ∅ ⇒<sup>−</sup> A, so that A and B are strongly equivalent in cut-free SN4. □


**Corollary 5.12** *The following pairs are pairs of formulas that are not synonymous with respect to cut-free* SN4 *because they are not strongly equivalent in cut-free* SN4*:*

*1.* ∼(p → q) *and* (p ∧ ∼q)*,*
*2.* ((p → p) → (q → q)) *and* ((q → q) → (p → p))*.*

By Proposition 5.11, synonymy implies strong equivalence, and one may wonder whether two formulas are synonymous in cut-free SN4 if and only if they are strongly equivalent. Strong equivalence in SN4 does not, however, imply synonymy in cut-free SN4.

**Proposition 5.13** *The converse of Proposition 5.11 does not hold.*

*Proof* The following pairs, for instance, are pairs of strongly equivalent formulas that are not synonymous:

$$p \text{ and } (p \lor p), \qquad p \text{ and } (p \land p), \qquad \sim p \text{ and } \sim (p \land p), \qquad \sim p \text{ and } \sim (p \lor p).$$

If D is a derivation in cut-free SN4 with no branching and D′ is a syntactically distinct derivation in cut-free SN4 with a branching, then D and D′ cannot be in-identical. Every cut-free proof of ∅ : (p ∨ p) ⇒<sup>+</sup> p is branching, whereas no cut-free proof of ∅ : p ⇒<sup>+</sup> (p ∨ p) is branching. Something analogous holds for the other pairs. □

#### **6 Conclusion**

We have introduced a notion of equivalence between derivations, inherited identity, in a cut-free sequent system for Almukdad and Nelson's constructive paraconsistent logic N4 with strong negation. This notion was obtained by considering sequent rules the application of which has no effect on the identity of derivations. The notion of in-identity has then been used to define a bilateralist notion of synonymy between formulas, which is a relation drawing more fine-grained distinctions between formulas than strong equivalence.

Note that this approach can also be applied to the connexive variant C of N4. A sequent calculus for C can be obtained from GN4 by replacing the rules for negated implications by the following ones, which allow one to prove the sequents ∅ ⇒ ∼(A → B) → (A → ∼B) and ∅ ⇒ (A → ∼B) → ∼(A → B); cf. Wansing (2005), Omori and Wansing (2020):

$$\frac{\Gamma \Rightarrow A \quad \sim B, \Delta \Rightarrow \mathcal{C}}{\sim (A \to B), \Gamma, \Delta \Rightarrow \mathcal{C}} \ (\sim \to \text{l})' \quad \frac{A, \Gamma \Rightarrow \sim B}{\Gamma \Rightarrow \sim (A \to B)} \ (\sim \to \text{r})'.$$

I conclude this note by giving a rather speculative glimpse into some further directions for studying the inherited identity of derivations that emerges from a distinction between proofs and refutations and the notion of synonymy based on in-identity.


1. It is conspicuous that there is a certain gap in the system of rules of SN4. There is no connective that stands to disjunction with respect to disproofs as conjunction stands to implication with respect to proofs. Closing this gap calls for adding to the language of N4 a binary connective that is in a certain sense dual to implication, namely the co-implication, −<, from the bi-intuitionistic logic 2Int introduced in Wansing (2016a; 2016b; 2017).<sup>15</sup> Concretely, we can add the following rules for −< to GN4:

$$\begin{array}{c} \frac{\Gamma \Rightarrow \sim A \quad \sim B, \Delta \Rightarrow C}{\sim(B -\!\!< A), \Gamma, \Delta \Rightarrow C} \quad \frac{\sim A, \Gamma \Rightarrow \sim B}{\Gamma \Rightarrow \sim(B -\!\!< A)}\\\\ \frac{\sim A, B, \Gamma \Rightarrow C}{(B -\!\!< A), \Gamma \Rightarrow C} \quad \frac{\Gamma \Rightarrow \sim A \quad \Gamma \Rightarrow B}{\Gamma \Rightarrow (B -\!\!< A)} \end{array}$$

and the following rules for −< to SN4:

$$\begin{array}{c} \frac{\Gamma_{1} : \Delta_{1} \Rightarrow^{-} A \quad B, \Gamma_{2} : \Delta_{2} \Rightarrow^{*} C}{\Gamma_{1}, \Gamma_{2}, (B -\!\!< A) : \Delta_{1}, \Delta_{2} \Rightarrow^{*} C} \ (-\!\!<\mathrm{l}-) \quad \frac{A, \Gamma : \Delta \Rightarrow^{-} B}{\Gamma : \Delta \Rightarrow^{-} (B -\!\!< A)} \ (-\!\!<\mathrm{r}-)\\\\ \frac{A, \Gamma : B, \Delta \Rightarrow^{*} C}{\Gamma : (B -\!\!< A), \Delta \Rightarrow^{*} C} \ (-\!\!<\mathrm{l}+) \quad \frac{\Gamma_{1} : \Delta_{1} \Rightarrow^{-} A \quad \Gamma_{2} : \Delta_{2} \Rightarrow^{+} B}{\Gamma_{1}, \Gamma_{2} : \Delta_{1}, \Delta_{2} \Rightarrow^{+} (B -\!\!< A)} \ (-\!\!<\mathrm{r}+). \end{array}$$

Consider the addition of −< to the language of N4. Let us refer to the result of adding the former set of sequent rules for −< to GN4 as GN4<sup>−<</sup>, and let us refer to the result of adding the latter sequent rules for −< and the following rules for the constants ⊤ and ⊥:

$$\top, \Gamma : \Delta \Rightarrow^{*} C \qquad \Gamma : \bot, \Delta \Rightarrow^{*} C \qquad \Gamma : \Delta \Rightarrow^{+} \top \qquad \Gamma : \Delta \Rightarrow^{-} \bot$$

to SN4 as SN4<sup>−<,⊤,⊥</sup>. Moreover, let us refer to the result of deleting the rules displaying strong negation, i.e., the interaction rules of SN4, from SN4<sup>−<,⊤,⊥</sup> as S2Int.

The language of S2Int is the language of the bi-intuitionistic logic 2Int from Wansing (2016a; 2016b; 2017).<sup>16</sup> With ⊤ and ⊥ as primitive, one can define two negations, the intuitionistic negation, ¬, and its counterpart with respect to co-implication, namely co-negation, −, as follows:

$$\neg A \coloneqq (A \to \bot) \qquad -A \coloneqq (\top -\!\!< A).$$

One natural question is whether in the absence of ∼, there are interaction rules that allow for an identifcation of proofs and disproofs of distinct formulas from the language of 2Int.

2. If we think of derivation trees from the perspective of model-theoretic semantics, it is natural to wonder about their Fregean denotation and sense. What do they

<sup>15</sup> For the notion of co-implication in Heyting-Brouwer logic see Goré (2000), Goré and Shillito (2020), Rauszer (1980), and Schroeder-Heister (2011) and references given there.

<sup>16</sup> Note that in the notation (Γ; Δ ⊢ ) and (Γ; Δ ⊢ ) from Wansing (2016a; 2016b; 2017), Γ stands for formulas assumed to be true, and Δ stands for formulas assumed to be false.


refer to and what do they express (what is their "mode of presentation" (Art des Gegebenseins))? Whatever the denotation and the sense of a derivation is, it seems to be clear that identical derivations ought to have the same sense and hence also the same denotation. There is not much work on the sense of derivations; I am aware of Ayhan (2021) and Tranchini (2016).

In order to make sure that two derivation trees are co-referential if they are encoded by co-referential, uniquely typed λ-terms, a two-sorted typed λ-calculus is needed. For a connexive variant of the bi-intuitionistic logic 2Int, a two-sorted typed λ-calculus has been presented in Wansing (2016b); see also Wansing (2010).

We are then working with two sorts of typed variables, proof variables x<sup>+</sup>, y<sup>+</sup>, z<sup>+</sup>, x<sub>1</sub><sup>+</sup>, x<sub>2</sub><sup>+</sup>, x<sub>3</sub><sup>+</sup>, ..., and disproof variables x<sup>−</sup>, y<sup>−</sup>, z<sup>−</sup>, x<sub>1</sub><sup>−</sup>, x<sub>2</sub><sup>−</sup>, x<sub>3</sub><sup>−</sup>, ..., together with the usual term-forming operations and new type-forming operations, <sup>+</sup> and <sup>−</sup>.

The interaction rules of SN4 give rise to the following term assignment rules:


*Example 6.1* (Example 5.7 continued) We noted that

$$\frac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{-} A\end{array} \quad \begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{-} B\end{array}}{\Gamma : \Delta \Rightarrow^{-} (A \vee B)}}{\Gamma : \Delta \Rightarrow^{+} \sim(A \vee B)} \approx \frac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{-} A\end{array}}{\Gamma : \Delta \Rightarrow^{+} \sim A} \quad \dfrac{\begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{-} B\end{array}}{\Gamma : \Delta \Rightarrow^{+} \sim B}}{\Gamma : \Delta \Rightarrow^{+} (\sim A \wedge \sim B)}.$$

If we encode the two in-identical derivations by typed λ-terms, we obtain:

$$\frac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{-} M^{-} : A\end{array} \quad \begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{-} N^{-} : B\end{array}}{\Gamma : \Delta \Rightarrow^{-} \langle M^{-}, N^{-}\rangle^{-} : (A \vee B)}}{\Gamma : \Delta \Rightarrow^{+} \langle M^{-}, N^{-}\rangle^{-+} : \sim(A \vee B)} \qquad \frac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ \Gamma : \Delta \Rightarrow^{-} M^{-} : A\end{array}}{\Gamma : \Delta \Rightarrow^{+} M^{-+} : \sim A} \quad \dfrac{\begin{array}{c}\mathcal{D}_2\\ \Gamma : \Delta \Rightarrow^{-} N^{-} : B\end{array}}{\Gamma : \Delta \Rightarrow^{+} N^{-+} : \sim B}}{\Gamma : \Delta \Rightarrow^{+} \langle M^{-+}, N^{-+}\rangle^{+} : (\sim A \wedge \sim B)}$$

We thus need a semantics for the two-sorted typed $\lambda$-calculus such that the equation

$$\langle M^{-}{:}A,\ N^{-}{:}B\rangle^{-+} : \sim(A \lor B) \;=\; \langle M^{-+}{:}{\sim}A,\ N^{-+}{:}{\sim}B\rangle^{+} : (\sim A \land \sim B)$$

is valid.

I conjecture that, in order to obtain a sound and complete semantics for such a two-sorted typed $\lambda$-calculus, for every formula (type) $A$ one will have to use a pair of two separate domains $\mathcal{D}^{+}_A$ and $\mathcal{D}^{-}_A$.
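To make the conjecture concrete, here is a minimal sketch of what such a two-domain semantics might look like; the particular clauses below are my illustrative assumption, not a proposal taken from the text:

```latex
% Each formula A is interpreted by a pair of domains:
% D_A^+ (proofs of A) and D_A^- (disproofs of A).
% Plausible clauses for the connectives in Example 6.1:
\[
\mathcal{D}^-_{A \lor B} = \mathcal{D}^-_A \times \mathcal{D}^-_B, \qquad
\mathcal{D}^+_{\sim A} = \mathcal{D}^-_A, \qquad
\mathcal{D}^+_{A \land B} = \mathcal{D}^+_A \times \mathcal{D}^+_B.
\]
% Under these clauses, both sides of the equation denote the same pair:
\[
\llbracket \langle M^-, N^- \rangle^{-+} \rrbracket
= (\llbracket M^- \rrbracket, \llbracket N^- \rrbracket)
= \llbracket \langle M^{-+}, N^{-+} \rangle^{+} \rrbracket
\;\in\; \mathcal{D}^-_A \times \mathcal{D}^-_B,
\]
% since D^+ of ~(A v B) and D^+ of (~A & ~B) are then literally the same set.
```

Under these (assumed) clauses the desired equation holds trivially, which at least shows the conjecture is coherent for this example.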

**Acknowledgements** A preliminary and substantially different work-in-progress version of this paper was presented at the *Third Tübingen Conference on Proof-Theoretic Semantics*, 27–30 March 2019, and the *Eleventh Smirnov Readings in Logic* at Lomonosov Moscow State University, 19–21 June 2019. I would like to thank Peter Schroeder-Heister and Thomas Piecha for inviting me to PTS III, Greg Restall and Lutz Straßburger for discussions at that conference, and Vladimir Markin for inviting me to the Smirnov Readings 2019. A version of the paper much closer to the above version was presented at *TULIPS – The Utrecht Logic in Progress Series*, October 6, 2020 and at the *Logic and Metaphysics Workshop*, City University of New York, November 2, 2020. I would like to thank Colin Caret and Rosalie Iemhoff for inviting me to the *TULIPS* workshop, Graham Priest for inviting me to the *Logic and Metaphysics Workshop*, and Sergei Artemov, Jan Broersen, Melvin Fitting, Rosalie Iemhoff, Valeria de Paiva, Graham Priest, and David Ripley for their questions and remarks. Moreover, I am grateful to Sara Ayhan, Tiago Rezende de Castro Alves, Göran Sundholm, and an anonymous referee for their helpful comments.

#### **References**


**Note added in proof.** A sequel to the present paper has been accepted for publication in the *Bulletin of the Section of Logic*, 2023, in a special issue on Bilateralism and Proof-Theoretic Semantics, namely Sara Ayhan and Heinrich Wansing, On synonymy in proof-theoretic semantics. The case of 2Int. This paper contains a proof of cut-elimination for a bilateral sequent calculus for 2Int. Note that there, in antecedent position of sequents, the assumptions taken to be true come first and the premises taken to be false second; cf. also footnote 16 above.

A type-theory for 2Int with Curry-style typing is presented in chapter 6 of Sara Ayhan, *Meaning and Identity of Proofs in (Bilateralist) Proof-theoretic Semantics*, 2023, submitted as a PhD thesis at Ruhr University Bochum. Moreover, in the paper by Sergei Odintsov, Sergey Drobyshevich and Heinrich Wansing, Moisil's modal logic and related systems, in: K. Bimbó (ed.), *Relevance Logics and other Tools for Reasoning. Essays in Honour of Michael Dunn*, College Publications, London, 2022, 150–177, it is observed that the bi-intuitionistic logic also known as Heyting–Brouwer logic, referred to in footnote 15 above, was already studied by Grigore Moisil in a publication from 1942, *Logique modale*.

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. If you remix, transform, or build upon this chapter or a part thereof, you must distribute your contributions under the same license as the original.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Paradoxes, Intuitionism, and Proof-Theoretic Semantics**

Reinhard Kahle and Paulo Guilherme Santos

Wenn wir einen mathematischen Beweis erst am Resultate auf seine Zulässigkeit prüfen können, so brauchen wir überhaupt keinen Beweis.

David Hilbert1

**Abstract** In this note, we review paradoxes like Russell's, the Liar, and Curry's in the context of intuitionistic logic. One may observe that one cannot blame the underlying logic for the paradoxes, but has to take into account the particular concept formations. For proof-theoretic semantics, however, this comes with the challenge of blocking some forms of direct axiomatizations of the Liar. A proper answer to this challenge might be given by Schroeder-Heister's *definitional freedom*.

**Key words:** paradoxes, liar, intuitionism, proof-theoretic semantics, definitional freedom

## **1 Weyl on the Grelling–Nelson paradox**

Kurt Grelling presented in 1908, in a paper published jointly with Leonard Nelson (Grelling and Nelson, 1908), the now well-known paradox concerning whether or not the adjective "heterological" is itself heterological.

Reinhard Kahle

Paulo Guilherme Santos

© The Author(s) 2024 T. Piecha and K. F. Wehmeier (eds.), *Peter Schroeder-Heister on Proof-Theoretic Semantics*, Outstanding Contributions to Logic 29, https://doi.org/10.1007/978-3-031-50981-0\_12

Carl Friedrich von Weizsäcker Center, University of Tübingen, Germany, and CMA, FCT, Universidade Nova de Lisboa, Portugal, e-mail: reinhard.kahle@uni-tuebingen.de

Centro de Matemática e Aplicações, NOVA School of Science and Technology, Universidade Nova de Lisboa, Portugal, e-mail: pgd.santos@campus.fct.unl.pt

<sup>1</sup> Hilbert (1917, p. 135). English translation: "If we can verify the admissibility of a mathematical proof only at the result, we do not need any proof at all."

Hermann Weyl discusses this paradox in some length in the opening section of his "predicative manifesto", *Das Kontinuum* (Weyl, 1918):2

But anyone who forgets that a proposition with such a structure can be meaningless is in danger of becoming trapped in absurdity — as a famous "paradox," essentially due to Russell, shows. Let a word which signifes a property be called *autological* if this word itself possesses the property which it signifes; if it does not possess that property, let it be called *heterological*. For example, the German word "kurz" (meaning "short") is itself *kurz* (i.e., is itself short — for a word in the German language which consists of only four letters will without question have to be described as a short one); hence "kurz" is autological. The word "long," on the other hand, is not itself long and, so, is heterological. Now what about the word "heterological" itself? If it is autological, then it has the property which it expresses and, so, is heterological. If, on the other hand, it is heterological, then it does not have this property and, so, is autological. Formalism regards this as an insoluble contradiction; but in reality this is a matter of scholasticism of the worst sort: for the slightest consideration shows that absolutely no sense can be attached to the question of whether the word "heterological" is itself auto- or heterological.

Weyl's warning, that propositions can be meaningless, can be taken as an indication that one would have to renounce the *tertium non datur* for such propositions. In other words, a logic which admits the formulation of such propositions cannot be classical; one will have to allow "truth values" — if they still can be called this way — beyond or between "true" and "false".

Historically, the natural choice for such a logic appears to be the intuitionistic one, which is distinguished for leaving out the *tertium non datur* from classical logic.3 In the following we see, however, that intuitionistic logic is not much help regarding the paradoxes.

#### **2 Russell's paradox in an intuitionistic setting**

Russell's paradox had a profound impact on the development of modern logic. On the one hand, it forced set theory to reconsider its formal basis, resulting eventually in Zermelo's axiomatization, which can be taken as the standard set-theoretical foundation today. On the other hand, it called mathematical concept formation into question, including the very notion of mathematical proof, prompting Hilbert to conceive proof theory as a tool to investigate the foundations of mathematics.

Interestingly, a simple inspection of Russell's paradox shows that it does not depend on the *tertium non datur*; in other words, the proof that the allowance to define the "Russell set" $\{x \mid x \notin x\}$ leads to a contradiction is carried out by logical reasoning which is intuitionistically valid.4 As far as we know, it was first put on

<sup>2</sup> The English citation is from Weyl (1987, p. 6f).

<sup>3</sup> Weyl joined Brouwer's intuitionism only shortly after the publication of *Das Kontinuum*. But his slight shift from predicativism to intuitionism does not affect the criticism stated in the paragraph above; rather to the contrary.

<sup>4</sup> See, for instance, Irvine and Deutsch (2016, §4). The first author learned this observation from Robert Lubarsky.

record by Neil Tennant (1982); see also footnote 6. He refers to Prawitz, who gives a proof which works in minimal logic (Prawitz, 1965, p. 95). At this point, however, Prawitz did not comment on the (weak) logical framework he was using.5

In consequence, this paradox (and, as we will see, others) cannot be resolved by just replacing the underlying classical logic by intuitionistic logic.6
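The intuitionistic (indeed minimal-logic) character of the argument can be checked mechanically. The following Lean 4 snippet is a sketch under the usual coding: writing $\rho$ for "$r \in r$" with $r = \{x \mid x \notin x\}$, naive comprehension yields $\rho \leftrightarrow \neg\rho$, from which absurdity follows without excluded middle:

```lean
-- Russell's paradox from ρ ↔ ¬ρ, using only minimal-logic reasoning:
-- no excluded middle, no double-negation elimination, no ex falso quodlibet.
example (ρ : Prop) (h : ρ ↔ ¬ρ) : False :=
  -- From ρ we get ¬ρ, hence ρ → False; so ¬ρ holds outright…
  have nρ : ¬ρ := fun p => h.mp p p
  -- …but then ρ holds as well, and the two collide.
  nρ (h.mpr nρ)
```

Note that `p` is used twice in the `have`, mirroring the contraction step that substructural resolutions of the paradox restrict.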

#### **3 The Liar and Curry's paradox**

The Liar paradox — "This sentence is false." — is surely the oldest paradox in scientific history, and it appears to be obvious that it just contradicts the law of bivalence. In this case, the standard argument indeed uses classical logic, arguing by a case distinction on the assumption that the sentence is true or that it is false, both leading to contradictions.

Curry (1942) presented a paradox, now named after him, to simplify the *Kleene–Rosser paradox*, which showed the inconsistency of a "$\lambda$-calculus logic" of Church. Curry points out that Kleene and Rosser had used the *Richard paradox*, while his argument is based on Russell or the Liar. In essence, Curry's paradox is based on the definability of sentences saying "This sentence implies $A$" for any sentence $A$. Requiring only some very simple rules for implication, one can obtain $A$ from the defined sentence. As, in this way, every formula of the system is derivable, the system is inconsistent.

Apparently, this paradox does not even involve negation — but we will argue in a minute that this is only apparent — and therefore it was taken to question our very intuition about implication. In particular, the reasoning used by Curry is clearly intuitionistically valid.

One can replace $A$ in Curry's sentence by the *falsum*, ⊥, a propositional constant for a false sentence. The single instances then follow with the intuitionistically valid principle *ex falso quodlibet*. But taking into account that intuitionistic7 negation ¬

<sup>5</sup> A more recent discussion of Prawitz and Tennant's treatment of Russell's paradox can be found in Schroeder-Heister and Tranchini (2017).

<sup>6</sup> See Tennant (1982, p. 268f.): "I shall always try to present my proof-theoretic considerations within *intuitionistic* logic. This should at least allay the suspicion that bivalence or excluded middle or some jaundiced relative is the source of contagion."

<sup>7</sup> See Troelstra and Schwichtenberg (2000, p. 3). Of course, in principle, also classical negation can be defined this way; but while, in intuitionistic logic, negation is directly *reduced* to ⊥, in classical logic, we would have merely a syntactic variation, still requiring axioms or rules involving negated formulas as such — see, for instance, the difference between the absurdity rules $\bot_i$ and $\bot_c$ in Troelstra and Schwichtenberg (2000, p. 37).

can be defined as $A \to \bot$, one observes that Curry's paradox (using ⊥)8 is essentially the Liar in intuitionistic terms.
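Curry's reasoning can likewise be replayed in a proof assistant. A sketch in Lean 4, assuming a sentence $C$ with $C \leftrightarrow (C \to A)$ as described above:

```lean
-- Curry's paradox: from C ↔ (C → A), an arbitrary A follows.
-- Only modus ponens and (implicit) contraction on C are used;
-- no negation symbol occurs anywhere in the derivation.
example (C A : Prop) (h : C ↔ (C → A)) : A :=
  have f : C → A := fun c => h.mp c c  -- c is used twice: contraction
  f (h.mpr f)
```

Instantiating `A := False` turns this into exactly the intuitionistic Liar discussed in the text.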

"At the referee's suggestion" Löb (1955, Fn. 4) remarks that Curry's "paradox is derived without the word 'not' " (Löb, 1955, p. 117). Löb was careful enough to speak only about the *word* "not". Beall and Murzi (2013, p. 144) rephrase this by saying that Curry's paradox "arises even in negation-free languages". But, as one can see from the definition of negation in an intuitionistic language without a primitive symbol for negation, the absence of the word does not necessarily imply the absence of the concept of negation. Even if Curry's paradox is still applicable in languages which would not even allow a definition by use of *falsum*, an implicit representation of negation will be present in any case.9

In the end — as for Russell — revoking bivalence is not enough to ban Liar-like paradoxes; rather, it appears that the paradoxes are independent of the underlying logic.

Concerning the given reading of intuitionistic negation, we would like to recall that a similar perspective was already given by Bernays (1979, p. 4) as a distinctive feature in comparison with classical logic:10

As one knows, the use of the "tertium non datur" in relation to infinite totalities, in particular already in arithmetic, was disputed by L. E. J. Brouwer, namely in the form of an opposition to the traditional logical principle of the excluded middle. Against this opposition it is to be remarked that it is just based on a reinterpretation of negation. Brouwer avoids the usual negation not-A, and takes instead "A is absurd". It is then clear that the general alternative "Every sentence A is true or is absurd" is not justified.

#### **4 Intuitionism**

The fact that changing from classical to intuitionistic logic does not resolve the paradoxes, neither Russell's nor the Liar, leads to the conclusion that one cannot hold

<sup>8</sup> The original version of Curry, allowing arbitrary formulas in the consequent, is, in some sense, more general, as it involves as a particular case "Löb's Theorem" (which, however, is not really a paradox any longer, at least not in the sense that it leads to a contradiction). The relation between Curry's paradox and Löb's Theorem is an interesting issue in itself; see, for instance, Ruitenburg (1991).

<sup>9</sup> Therefore, it seems to be misleading to say, as van Benthem (1978, p. 49) does, that "Curry's paradox shows that negation is not essential in this connection"; it is just the negation *symbol* which appears not to be essential.

<sup>10</sup> German original: "Wie man weiß, ist die Verwendung des 'tertium non datur' in bezug auf unendliche Gesamtheiten, insbesondere schon in der Arithmetik, von L. E. J. Brouwer angefochten worden, und zwar in der Form einer Opposition gegen das traditionelle logische Prinzip vom ausgeschlossenen Dritten. Gegenüber dieser Opposition ist zu bemerken, daß sie ja auf einer Umdeutung der Negation beruht. Brouwer vermeidet die übliche Negation nicht-A, und nimmt stattdessen 'A ist absurd'. Es ist dann klar, daß eine allgemeine Alternative 'Jede Aussage A ist wahr oder ist absurd' nicht berechtigt ist."

the logic responsible for them.11 Thus, one should look at the concept formations involved in the paradoxes when searching for solutions.

In standard semantics, one takes care of concept formations by careful choices of interpretations. This involves either a staunch platonistic insight into the interpretation or, at least, a firm confidence in set-theoretic constructions for it.12

Zermelo (1908) gave the classical example of *concept formation* when he axiomatized set theory with the explicit aim to ban the paradoxes:

Under these circumstances there is at this point nothing left for us to do but [. . .] to seek out the principles required for establishing the foundations of this mathematical discipline. In solving the problem we must, on the one hand, restrict these principles sufficiently to exclude all contradictions and, on the other, take them sufficiently wide to retain all that is valuable in this theory.13

The principles of set theory, of course, serve as (implicit) definitions of the set-theoretical concepts. In practice, Zermelo's set theory fully satisfies the needs of mathematicians. But Poincaré was not convinced (cited according to Gray, 2013, p. 540): "But even though he has closed his sheepfold carefully, I am not sure that he has not set the wolf to mind the sheep." Thus, without a consistency proof for the axiomatized set theory, the situation remains unsatisfactory from a philosophical point of view.

Interestingly, Brouwer too can cope with the problem, insofar as one puts concept formation ahead of logic. This is in line with his idea that mathematics precedes logic.14 For Russell's paradox, one may note that Brouwer clearly rejected Cantorian set theory as such, and abstract set formation principles are plainly anti-intuitionistic. In the same way, formalizations of the Liar and Curry's paradox depend on self-referential features of formal languages — to be implemented by some kind of Gödelization. But such formal languages are not the subject of intuitionism. In this perspective, the paradoxes may even support Brouwer's anti-logical convictions.

This perspective also vindicates Weyl (1987, p. 5), who used his criticism of (the scholasticism around) the Grelling–Nelson paradox not to advocate a many-valued logic, but rather to demand a careful delimitation of the "categories" to which a meaningful proposition is affiliated.

Here is not the place to evaluate the success of intuitionism in providing convincing techniques for concept formation. Brouwer coined the name with reference to the

<sup>11</sup> Here, we are not going into the attempts to mutilate further the logical framework (as, for instance, by questioning *modus ponens*); nor do we discuss informal notions of provability (Weaver, 2012) or validity versions of the Liar (Beall and Murzi, 2013) which are sometimes used to clarify the situation.

<sup>12</sup> See, for instance, Feferman (2000, p. 72).

<sup>13</sup> Zermelo (1967, p. 200). German original (Zermelo, 1908, p. 261): "Unter diesen Umständen bleibt gegenwärtig nichts anderes übrig, als [. . .] die Prinzipien aufzusuchen, welche zur Begründung dieser mathematischen Disziplin erforderlich sind. Diese Aufgabe muß in der Weise gelöst werden, daß man die Prinzipien einmal eng genug einschränkt, um alle Widersprüche auszuschließen, gleichzeitig aber auch weit genug ausdehnt, um alles Wertvolle dieser Lehre beizubehalten."

<sup>14</sup> See the third chapter of Brouwer's dissertation, reprinted in English translation in L. E. J. Brouwer (1975), which contains the theses: *Mathematics is independent of logic* and *Logic depends upon mathematics*.

intuition of mathematicians. The resulting risk of subjectivity was criticised by Lorenzen with respect to the rejection of the *tertium non datur*:15 "Unfortunately, the explanation which Brouwer himself offers for this phenomenon is an esoteric issue: only one who listened to the Master himself understands him." Anyhow, it is the supposed intuition which should save the intuitionist from contradictions, not the underlying logic.

To avoid misunderstandings, we have to stress that Brouwer's conception of intuitionism was by no means motivated by the paradoxes — quite contrary to Hilbert's motivations for his foundational research.16 In fact, Brouwer did not comment on the paradoxes at all, except for a plain rejection of Cantorian or axiomatic set theory in his dissertation and in a paper of 1912; cf. L. E. J. Brouwer (1975, pp. 80f. and 130f.).

#### **5 Proof-theoretic semantics**

Proof-theoretic semantics "is based on the fundamental assumption that the central notion in terms of which meanings are assigned to certain expressions of our language, in particular to logical constants, is that of *proof* rather than *truth*. In this sense proof-theoretic semantics is *semantics in terms of proofs*" (Schroeder-Heister, 2016b). "Proof-theoretic semantics is intuitionistically biased" (Schroeder-Heister, 2016b, §3.5)17. As such, it is confronted with the paradoxes in the very same way as intuitionism itself; but, as we will see, there is an additional challenge.

In a first step, proof-theoretic semantics may follow the "solution" we attributed to Brouwer: turning to the particular concept formations. In the case of Russell's paradox, this means that one would have to provide a proof-theoretic semantics for the set formation principles (expecting that such a semantics blocks the possibility of introducing the "Russell set"). Such an approach was, in fact, already initiated by Hallnäs (2016). With respect to the Liar in its usual form, one needs a framework axiomatizing truth and providing some form of term representation of formulas (as *Gödelization*). There are plenty of truth theories around, and giving them a proof-theoretic semantics can be subsumed under the "open problem" of *Proof-Theoretic Semantics Beyond Logic* addressed by Schroeder-Heister (2016a, §4).

However, there is another form of treating the Liar in a formal theory, which constitutes a genuine challenge to proof-theoretic semantics. One may axiomatize a self-contradicting atom $R$ with $R \leftrightarrow \neg R$. Schroeder-Heister (2012a) introduced such an $R$ in a sequent calculus by the following two rules:

<sup>15</sup> German original in Lorenzen (1960): "Unglücklicherweise ist die Erklärung, die Brouwer selbst für dieses Phänomen anbietet, eine esoterische Angelegenheit: nur, wer den Meister selber hörte, versteht ihn."

<sup>16</sup> For Hilbert's motivation, see Kahle (2006).

<sup>17</sup> "Most forms of proof-theoretic semantics are intuitionistic in spirit, which means in particular that principles of classical logic such as the law of excluded middle or the double negation law are rejected or at least considered problematic." (Schroeder-Heister, 2016b, §1.2)


*Paradoxical rules*

$$\frac{\Gamma, \neg R \vdash C}{\Gamma, R \vdash C}\ (R{\vdash}) \qquad \qquad \frac{\Gamma \vdash \neg R}{\Gamma \vdash R}\ ({\vdash}R)$$

In a fine analysis of the contradiction which one can derive from these rules (together with the usual structural rules and rules for negation), Schroeder-Heister shows that the contradiction can be blocked by imposing some restriction on each of the structural rules of Identity, Contraction, or Cut — restrictions which would not harm " 'ordinary' mathematical reasoning" (Schroeder-Heister 2012a; 2016c). From the point of view of proof-theoretic semantics, however, we do not see that such restrictions — in fact, any switch to sub-structural logics — are justifiable, as one would not like to dismiss the usual structural rules in other contexts.
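For orientation, here is one way such a contradiction can be reached (a reconstruction sketch with standard negation rules, not necessarily Schroeder-Heister's exact derivation); note that Identity, Contraction, and Cut each occur, which is why restricting any one of them blocks the argument:

```latex
\[
\begin{array}{lll}
1. & R \vdash R            & \text{(Identity)}\\
2. & R, \neg R \vdash C    & (\neg\text{-left on } 1)\\
3. & \neg R \vdash \neg R  & \text{(Identity)}\\
4. & R \vdash \neg R       & ((R\vdash) \text{ on } 3)\\
5. & R, R \vdash C         & (\text{Cut on } \neg R;\ 4 \text{ and } 2)\\
6. & R \vdash C            & \text{(Contraction)}\\
7. & \vdash \neg R         & (\neg\text{-right, from the instance } R \vdash \bot \text{ of } 6)\\
8. & \vdash R              & ((\vdash R) \text{ on } 7)\\
9. & \vdash C              & (\text{Cut on } R;\ 8 \text{ and } 6)
\end{array}
\]
```

Since $C$ is arbitrary, line 9 trivializes the calculus.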

Substructural logics (Došen and Schroeder-Heister, 1993) are not intended to mutilate logic (see footnote 11); rather, they serve as tactical modifications of standard logic to obtain a fine-grained analysis of the interplay of different structural operations. Some substructural logics turned out to be of interest in special applications, such as the Lambek calculus (Lambek, 1958) in linguistics and Linear Logic (Girard, 1987) for resource-aware reasoning. The "sub"structural character of the latter is manifest in the possibility of reestablishing classical reasoning by use of the bang operator. Intuitionistic logic, however, when considered as a substructural logic in the form of a mono-succedent sequent calculus, has further claims. Brouwer and Weyl were aiming, indeed, to replace classical reasoning in Mathematics. And they invoked sophisticated philosophical arguments — although these arguments were dismissed (or ignored) by the mathematical community at large. But they neither addressed the paradoxes nor were concerned with technical properties of calculi.

If, thus, classical reasoning should not be dismissed in general, proof-theoretic semantics should provide an argument that directly invalidates the mentioned *paradoxical rules*. We are facing here a "*tonk*-like phenomenon",18 and as such it is discussed in Tranchini (2016). As for *tonk*, the sheer definition of $(R{\vdash})$ and $({\vdash}R)$ would spoil our calculus. To deal with *tonk*, *proof-theoretic harmony* was conceived as a possible solution.19 But the two paradoxical rules appear to be in perfect harmony; this was already observed by Read (2010, §7).
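The "*tonk*-like" character can be made vivid in Lean 4: postulating Prior's two tonk rules as axioms (a deliberately unsound assumption, introduced here purely for illustration) makes every proposition derivable from any other:

```lean
-- Prior's tonk, postulated axiomatically (intentionally unsound!).
axiom tonk : Prop → Prop → Prop
axiom tonkI {A B : Prop} : A → tonk A B   -- introduction: from A infer A tonk B
axiom tonkE {A B : Prop} : tonk A B → B   -- elimination: from A tonk B infer B

-- One introduction followed by one elimination proves anything from anything.
theorem tonk_trivializes {A B : Prop} (a : A) : B := tonkE (tonkI a)
```

The paradoxical rules for $R$ spoil a calculus in just this definitional way, which is why a criterion beyond harmony is needed to rule them out.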

Next to harmony — the "first principle" of proof-theoretic semantics — we can consider a "second principle": *normalizability of proofs*. In fact, Tennant (1982) used

$$\frac{A}{A \text{ tonk } B}\ (\text{tonk-I}) \qquad \qquad \frac{A \text{ tonk } B}{B}\ (\text{tonk-E})$$

<sup>18</sup> The "tonk" connective was introduced by Prior (1960) by the following two rules:

It is widely discussed in the literature and, for our context, we may refer to Read (2010) or Tranchini (2016) for further information.

<sup>19</sup> This concept is due to Dummett (1981; 1991); for a recent discussion see, for instance, Tranchini (2021).

normalizability20 exactly to block the paradoxical rules under discussion.21 This is today one of the main directions proof-theoretic semantics is following, and Tranchini (2015; 2016) proposed, quite convincingly, a treatment of *tonk* and the paradoxical rules in this vein, combining harmony and normalizability.

The problem with the "second principle" is that it is *global* and no longer *local*; i.e., we cannot assign a proof-theoretic meaning to the connectives by solely inspecting the given rules, but have to prove properties of derivability in general.22 Ultimately, we are confronted with Hilbert's concern as expressed in the citation at the beginning of the paper: the admissibility of a proof can only be verified *a posteriori*.23

But even using normalizability as a proof-theoretic principle, we have no superordinate philosophical argument that this blocks *all* potential contradictions; we have just verified empirically that it will block the Liar.24 There is an important lesson to learn from Martin-Löf's first type theory (Martin-Löf, 1971). It was conceived in a way that it was not subject to a direct Liar-like contradiction; only a much more subtle argument, expressed in *Girard's paradox*, showed its inconsistency (Coquand, 1986; Hurkens, 1995).25 Thus, global proof-theoretic conditions which block "one or another" paradox might be far from sufficient to convince one of the consistency of a system as a whole.

With reference to Hallnäs (1991; 2006), Schroeder-Heister proposes a possible solution: *Definitional Freedom* (Schroeder-Heister 2012b, §2 and 2016a, p. 276). Under this freedom, one does not forbid any rules, but has to single out the "well-behaved" ones by (*a posteriori*) mathematical arguments. Qualitatively, this was already done by Gödel when he formalized the sentence "I am not provable in $T$" in an arithmetical (consistent) theory $T$.26 The definition is perfectly fine, but the

<sup>20</sup> Ekman (2016, p. 212) observed that non-normalizability can be related to some form of "overloading" of (the use of) propositions: "A *self-contradictory argument* is, informally, an argument [. . .] in which there is a proposition which is used in two or more ways such that not all of the ways of using the proposition are compatible."

<sup>21</sup> For critical evaluations of Tennant's approach see, for instance, Schroeder-Heister and Tranchini (2017) and Petrolo and Pistone (2019).

<sup>22</sup> See also *Local and Global Proof-Theoretic Semantics* (Schroeder-Heister, 2016a, §2.4).

<sup>23</sup> In principle, normalizability could be proven, for a given set of axioms, before performing the single proofs. But there are two problems: first, normalizability will be, in general, an undecidable property; second, proof-theoretic semantics would depend on such a (meta-)proof of normalizability, which is rather delicate with respect to the philosophical claim of proof-theoretic semantics.

<sup>24</sup> More exactly: the Liar and some other known paradoxes; see Tennant (1982). Tennant is well aware of the limitations of his approach: "I fully realise how inadequate any supposedly final word on this matter [the paradoxes] would be." (Tennant, 1982, p. 278). Thus, we are still in need of a *philosophical* argument that normalizability is more than an "ad hoc reply" to the known paradoxes.

<sup>25</sup> Admittedly, Girard's paradox will not be detected by a "slightest consideration"; thus, the situation

is far more complex than Weyl might have judged it in 1917.

<sup>26</sup> Gödel explicitly refers to the paradoxes as heuristic motivation: "The analogy between this result and Richard's antinomy leaps to the eye; there is also a close relationship with the 'liar' antinomy"; in a footnote he continues: "Every epistemological antinomy can likewise be used for a similar undecidability proof" (Gödel, 1931, p. 175, translated); see also Lethen (2021).

provability predicate of $T$ turns out to be incomplete, i.e., not every sentence is either provable or refutable in terms of such a formal predicate.27

The situation is also exemplified in recursion theory, where one does not want to forbid the definition of partial functions, but rather wants to single out, *a posteriori*, the functions which are total (or the domain of a partial function).28 For recursion theory, there are adequate formal systems to incorporate reasoning about partiality, namely *free logics* (Bencivenga, 2002) or the *logic of partial terms* (Beeson, 1985). For logical calculi, there exists a largely forgotten attempt by Behmann (1959)29, but a modern worked-out formalism is still a desideratum. Incorporating the reasoning about the "well-behaviour" of definitions would, in fact, vindicate both Weyl and Hilbert: for Weyl, the slightest (or not so slight) consideration about the sense of a definition would become explicit; for Hilbert, the admissibility of (the concepts used in) a proof would be checked not only at the result, but — if not globally, which we can no longer expect in view of undecidability phenomena — at least locally, for every proof in advance.

In fact, it is one of the features of the paradoxes that they work with *locally correct reasoning*.30 According to our analysis, the paradoxes are not phenomena of the underlying logic, but of the concept formations; therefore, a proper treatment has to take into account the reasoning about these concepts. For proof-theoretic semantics, such reasoning should be part of the game, and we agree with Schroeder-Heister (2012b, p. 78) in allowing such reasoning within the formal frameworks:

We strongly propose definitional freedom in the sense that there should be one or several formats for definitions, but within this format one should be free. Whether a certain definition is well-behaved is a matter of (mathematical) 'observation', and not something to be guaranteed from the very beginning.

#### **References**

Beall, J. C. and J. Murzi (2013). Two flavors of Curry's paradox. *Journal of Philosophy* 110, 143–165.

Beeson, M. (1985). *Foundations of Constructive Mathematics*. Ergebnisse der Mathematik und ihrer Grenzgebiete; 3. Folge, Band 6. Springer.

<sup>27</sup> Apparently, it was Russell who didn't get the point here, when he argued as if Gödel had shown an inconsistency in mathematics, in a letter to Leon Henkin of 1 April 1963, cited in Dawson, Jr. (1988, p. 89f).

<sup>28</sup> The common underlying reason for partiality in recursion theory and incompleteness in arithmetic is, of course, *diagonalization*. It is the source of most of the paradoxes, and *definitional freedom* requires dealing positively with it, rather than forbidding it. Discussions of such positive handling in Mathematics can be found, for instance, in Sommaruga-Rosolemos (1991) and Santos (2020).

<sup>29</sup> See also Thiel (2019).

<sup>30</sup> Schroeder-Heister (2016a, §4.2) remarked this in the context of clausal definitions: "This connects the proof theory of clausal definitions with theories of paradoxes, which conceive paradoxes as based on locally correct reasoning."


Santos, P. G. (2020). *Diagonalization in Formal Mathematics*. BestMasters. Springer.


Tranchini, L. (2015). Harmonising harmony. *The Review of Symbolic Logic* 8, 411–423.


Open Access This chapter is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. If you remix, transform, or build upon this chapter or a part thereof, you must distribute your contributions under the same license as the original.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **On the Structure of Proofs**

Lars Hallnäs

**Abstract** The initial premise of this paper is that the structure of a proof is inherent in the definition of the proof. Side conditions to deal with the discharging of assumptions mean that this does not hold for systems of natural deduction, where proofs are given by monotone inductive definitions. We discuss the idea of using higher-order definitions and the notion of a functional closure as a foundation to avoid these problems. In order to focus on structural issues we introduce a more abstract perspective, where a structural proof theory becomes part of a more general theory of functional closures. A notion of proof equations is discussed as a structural classifier, and we compare the Russell and Ekman paradoxes to illustrate this.

## **1 Introduction**

If we consider proofs represented in a formal calculus such as a system of natural deduction or a sequent calculus, a proof is a tree structure of some sort. From the elementary *monotone* inductive definition of a proof in a sequent calculus we can derive an explicit tree structure, but it is an indirect representation, as it is a calculus of derivability. In natural deduction the situation is complicated by the discharging of assumptions: a straightforward monotone inductive definition has to be complemented by side conditions. Such a calculus does not in a direct manner provide a mathematically satisfying foundation for a structure theory of proofs. The basic problem in systems of natural deduction is that we try to represent a *functional* structure in terms of a function closure, which forces us to add side conditions that make structural issues difficult to handle and unclear with respect to classifications.

While the rules of conjunction introduction and implication elimination, i.e., modus ponens, relate to the application of an elementary function that builds the proof from given proofs of the premises, the implication introduction rule is different, as it builds on *conditional* reasoning. It is *functional* in nature: it is not naturally an element of a function closure, but of a *functional closure*.

Lars Hallnäs
The Swedish School of Textiles, University of Borås, Sweden, e-mail: lars.hallnas@hb.se

© The Author(s) 2024
T. Piecha and K. F. Wehmeier (eds.), *Peter Schroeder-Heister on Proof-Theoretic Semantics*, Outstanding Contributions to Logic 29, https://doi.org/10.1007/978-3-031-50981-0_13

The inferences we make as we build a proof can be local in nature, i.e., given a series of premises we draw a direct conclusion. Given a proposition stating that an object is an element of a set, we look up the definition of the set and draw the conclusion that the object satisfies these definitional conditions. As a rule of inference this is at the level of functions on a certain domain of objects. The proof we obtain so far is simply built by applying a function to the proofs of the given premises.

Inferences may also be more non-local in nature, as they depend not only on the proofs of the immediately given premises, but furthermore on a conditional argument. We may assume a certain proposition $A$ to be true and, using this assumption, prove absurdity, thus concluding that $\neg A$ must be true. Here, the conclusion that $\neg A$ must be true depends on a function that, given a proof of $A$, produces a proof of absurdity. The rule of inference we apply here is at the level of functionals operating on functions over a certain domain of proofs.

In this paper we will discuss a general and more abstract approach to a structure theory of proofs, where proofs will simply be viewed as a special class of functional closures. These ideas have been discussed briefly in earlier work (Hallnäs, 2006). Here we will discuss this approach in more detail and introduce some foundational notions that can be used to discuss the structural complexity and structural similarities of proofs.

As we discuss structural properties of proofs in terms of functional closures in this paper, we disregard, or bracket, provability more or less completely, i.e., we bracket what it is that a proof proves. It could be tempting to use some sort of propositions-as-types system in order to relate functional closure *terms* with the propositions that a proof proves. But that would be rather strange, since the whole idea here is to search for a mathematically simpler abstract structure theory of proofs, and introducing a type system would just complicate matters without any real gain.

#### **2 Definitions**

To model functional closures we will use the framework of partial inductive definitions (Hallnäs, 1991; Hallnäs and Schroeder-Heister, 1991; Schroeder-Heister, 1993), which extends monotone inductive definitions with higher-order, possibly non-monotone, definitions.

A *definition* is a function

$$D:Dom(D) \to \mathcal{P}(CoDom(D)).$$

Intuitively we may think of $D$ as a collection of, possibly non-deterministic, equations

$$a = A$$

saying that the *atom* $a$ is defined by the *condition* $A$, where $a \in Dom(D)$ for some given universe $U$ and where $A \in CoDom(D)$ is a condition built up from objects in $Dom(D)$, $\top$ and $\bot$ using (possibly infinitary) conjunctions $\bigwedge$ and arrows $\to$. The logical interpretation of a definition is given by a *local* notion of consequence $\Gamma \vdash_D C$, where $\Gamma \subset Dom(D)$ and $C \in CoDom(D)$:

$$\begin{array}{cc}
\Gamma, C \vdash_D C \\\\
\Gamma \vdash_D \top & \Gamma, \bot \vdash_D C \\\\
\dfrac{\Gamma \vdash_D A_i}{\Gamma \vdash_D \bigwedge_i A_i}\ (i \in I) & \dfrac{\Gamma, A_i \vdash_D C}{\Gamma, \bigwedge_i A_i \vdash_D C}\ (i \in I) \\\\
\dfrac{\Gamma, A \vdash_D B}{\Gamma \vdash_D A \to B} & \dfrac{\Gamma \vdash_D A \quad \Gamma, B \vdash_D C}{\Gamma, A \to B \vdash_D C} \\\\
\dfrac{\Gamma \vdash_D A}{\Gamma \vdash_D a}\ (A \in D(a)) & \dfrac{\Gamma, A \vdash_D C}{\Gamma, a \vdash_D C}\ (A \in D(a))
\end{array}$$

The property over $Dom(D)$ defined by $D$ is the collection of atoms true in the derived logic, $Def(D) = \{a \mid \; \vdash_D a\}$, while $\{A \mid \; \vdash_D A\}$ is the collection of conditions that covers the *true* objects and *valid* connections.
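For the simplest case, a finite, monotone, arrow-free definition, the set Def(D) can be computed directly as a least fixed point. The following Python sketch uses an encoding of our own (not from the paper): a definition maps each atom to a list of defining conditions, each condition being the tuple of atoms of a finite conjunction, with the empty tuple playing the role of ⊤.

```python
# Minimal sketch (assumed encoding): Def(D) for a finite, monotone
# definition D, computed as the least fixed point of the associated
# one-step operator.

def Def(D):
    true = set()
    changed = True
    while changed:
        changed = False
        for atom, conditions in D.items():
            # an atom becomes true if some defining condition holds,
            # i.e., all conjuncts of that condition are already true
            if atom not in true and any(all(b in true for b in cond)
                                        for cond in conditions):
                true.add(atom)
                changed = True
    return true

# a = ⊤,  b = a,  c = a ∧ b,  d = e,  e = d  (d and e never ground out)
D = {'a': [()], 'b': [('a',)], 'c': [('a', 'b')],
     'd': [('e',)], 'e': [('d',)]}
```

Running `Def(D)` on this example yields the atoms `a`, `b`, `c` but not the mutually defined `d` and `e`, which never reach an axiom.
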

#### **3 The functional closure**

Let $X \subset U$ and let $f_1, \dots, f_n$ be functions with arities $k_1, \dots, k_n$ over $U$. The *function closure* defined by $(X, f_1, \dots, f_n)$ is the class obtained by starting with $X$ and closing it under the given functions, i.e., the class given by the definition $D(X, f_1, \dots, f_n)$:

$$\begin{aligned} a &= \top \quad (a \in X) \\ f_i(\mathbf{x}_1, \dots, \mathbf{x}_{k_i}) &= (\mathbf{x}_1, \dots, \mathbf{x}_{k_i}) \quad (i \le n) \end{aligned}$$

This is the smallest class containing $X$ and closed under the functions $f_1, \dots, f_n$.

The intuition behind the function closure is simply that we build a collection of objects by starting with some given *atoms* and then successively close our collection under the given functions—a monotone operation.
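The monotone construction can be replayed in a toy Python encoding of our own (term constructors as tagged tuples, with a depth cut-off, since the full closure is the union over all depths):

```python
# Sketch of the monotone function-closure construction D(X, f_1, ..., f_n):
# start from the atoms X and successively close under term functions.
from itertools import product

def function_closure(X, functions, depth):
    """Iterate the closure operator `depth` times.

    `functions` is a list of (name, arity) pairs; terms are tagged tuples."""
    closure = set(X)
    for _ in range(depth):
        new = set()
        for name, arity in functions:
            for args in product(closure, repeat=arity):
                new.add((name,) + args)   # the term f_i(x_1, ..., x_k)
        closure |= new
    return closure

# One closure step over {a, b} with a single binary term function ie:
C = function_closure({'a', 'b'}, [('ie', 2)], depth=1)
```

After one step the closure contains the two atoms plus the four terms `ie(x, y)` with `x, y ∈ {a, b}`.
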

Now assume $X \subset U$, functions $f_1, \dots, f_n$ with arities $k_1, \dots, k_n$ over $U$, a functional $F : [U \to U] \to U$ and a class $\mathcal{K} \subset [U \to U]$. The *functional closure* with respect to $(X, f_1, \dots, f_n, F, \mathcal{K})$ is given by the following definition $D(X, f_1, \dots, f_n, F, \mathcal{K})$:

$$\begin{aligned} a &= \top \quad (a \in X) \\ f_i(\mathbf{x}_1, \dots, \mathbf{x}_{k_i}) &= (\mathbf{x}_1, \dots, \mathbf{x}_{k_i}) \quad (i \le n) \\ F(f) &= \bigwedge_{\mathrm{dom}(f)} (\mathbf{x} \to f(\mathbf{x})) \quad (f \in \mathcal{K}) \end{aligned}$$

The functional closure builds on this idea, adding a further abstraction: the closure under given functionals. Intuitively this means that we build our collection of objects at the same time as we build the function space of this very collection of objects—a possibly non-monotone operation.

#### **4 Proof structures**

The idea of using the functional closure, or more precisely a class of functional closures, as a foundation for a structure theory of proofs is that this is a very elementary and general foundation that captures the distinction between local and non-local rules of proof in a precise manner. Another way to say this is that it is a general abstraction of natural deduction based on inductive definitions without side conditions for non-local rules, i.e., the structure of the proofs is internal to the inductive definition of them. The definition of a proof gives a local logic of the proof, and the definition itself is closely related to the idea of validity (Prawitz 1971; 1973b; 1973a; Schroeder-Heister 2006; 2012) of a proof modulo rules of proof contractions.

As a certain kind of abstract proof theory, this type of structure theory becomes part of the general study of the functional closure. Initially this calls for axioms that set up boundaries and frame such a theory.

We will identify a *proof structure* with a functional closure

$$DF_1 \cdots F_m = D(X, f_1, \dots, f_n, F_1, \dots, F_m)$$

(we use $D\overline{F}$ for short). We assume that $D\overline{F}$ is deterministic.

A *proof* is then intuitively a focal point $a \in Def(D\overline{F})$; we refer to these focal points as *proof objects*. But what is of main interest here is the definition of the proof that presents the structure of the proof.

As an example consider the following proof structure $IE(a, b, ie, ki, F, \mathcal{K})$:

$$\begin{aligned} a &= \top \\ b &= \top \\ ie(\mathbf{x}, \mathbf{y}) &= (\mathbf{x}, \mathbf{y}) \\ ki(\mathbf{x}, \mathbf{y}) &= (\mathbf{x}, \mathbf{y}) \\ F(f) &= \bigwedge\_{\text{dom}(f)} (\mathbf{x} \to f(\mathbf{x})) .\end{aligned}$$

In this example $U$ is a universe of terms, $a$ and $b$ constants, the functions $ie$ and $ki$ are *term functions* and $F$ a *term functional*, i.e., $ie(\mathbf{x}, \mathbf{y})$, $ki(\mathbf{x}, \mathbf{y})$ and $F(f)$ are the terms themselves. The *proof terms*, the notation we use for proof objects, build on variables and constants for objects in a given domain, function terms $f_i(t_1, \dots, t_{k_i})$ and functional terms $F(f)$. We will refer to a proof structure that builds on a term universe, and where the functions $f_1, \dots, f_n$ and the functionals $F_1, \dots, F_m$ are all term functions and term functionals, as a *term proof structure*.

Consider the following proof object

$$\operatorname{ie}(F(f), \operatorname{ie}(a, b)), \text{ where } f(\mathbf{x}) = k i(b, \mathbf{x}).$$

Given that $a = p \to q$ and $b = p$, this proof object is a representation of a proof of $p \land q$ in the following sense:

Assume $p$ and assume $q$. By ∧-introduction we get $p \land q$; this is $ki(b, \mathbf{x})$ for the assumption $\mathbf{x}$ of $q$. It is clear that the definition is locally provably closed under the function $f(\mathbf{x}) = ki(b, \mathbf{x})$. By →-introduction we get $q \to p \land q$. This is $F(f)$. Now assume $p \to q$ and $p$; by →-elimination we have $q$. This is $ie(a, b)$. Finally, by →-elimination we have $p \land q$, i.e., $ie(F(f), ie(a, b))$.

The *normal form* of this proof is the proof object $ki(b, ie(a, b))$. In natural deduction for propositional logic this corresponds to:

$$\frac{\dfrac{\dfrac{p \quad [q]}{p \land q}}{q \to p \land q} \qquad \dfrac{p \to q \quad p}{q}}{p \land q} \quad \Rightarrow \quad \frac{p \quad \dfrac{p \to q \quad p}{q}}{p \land q}$$

Note that →-introduction will be represented by functional objects $F(f)$, so a *cut* will have the form $ie(F(f), b)$, whereas the situation will be reversed for the ∨-rules, where ∨-elimination is represented by functional objects (applying a functional to two functions and a proof object) and the ∨-introduction rules are represented by function objects $h_1(\mathbf{x})$, $h_2(\mathbf{x})$, which somehow introduce a reverse cut.

It is also possible to develop a certain *model theory* of proofs using the local logic of a proof structure, i.e., a certain proof semantics.

Given $M \subset Dom(D)$, let $D + M$ be the definition obtained by adding definitional clauses $a = \top$ for all $a \in M$ to the clauses of $D$. Now we define $M \models_D A$ to hold if $A \in Def(D + M)$. Using this we can define a notion of logical consequence in model-theoretical terms:

$$A \models_D B \text{ iff } M \models_D A \text{ implies } M \models_D B.$$
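Restricted again to the finite, monotone case, this model-theoretic consequence can be checked by brute force over all M ⊂ Dom(D). A Python sketch in our own encoding (conditions as tuples of atoms, names ours):

```python
# Sketch: a |=_D b iff every M with a in Def(D+M) also has b in Def(D+M),
# for a finite, monotone definition D.
from itertools import chain, combinations

def Def(D, M=frozenset()):
    """Least fixed point of D with the atoms in M added as axioms (D + M)."""
    true = set(M)
    changed = True
    while changed:
        changed = False
        for atom, conds in D.items():
            if atom not in true and any(all(b in true for b in c)
                                        for c in conds):
                true.add(atom)
                changed = True
    return true

def models_consequence(D, a, b):
    """Check a |=_D b by enumerating all subsets M of the atoms of D."""
    atoms = set(D)
    subsets = chain.from_iterable(combinations(atoms, r)
                                  for r in range(len(atoms) + 1))
    return all(b in Def(D, frozenset(M))
               for M in subsets if a in Def(D, frozenset(M)))

# q is defined by p; r is independent of both.
D = {'p': [], 'q': [('p',)], 'r': []}
```

Here `models_consequence(D, 'p', 'q')` holds (every model making `p` true derives `q`), while `models_consequence(D, 'p', 'r')` fails, witnessed by M = {p}.
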

#### **5 Proof unfolders**

The local logic of a proof structure places proof objects in a logical context. A more direct structural approach would be to *unfold* the proof structure at the given object, i.e., to go backwards and trace the way in which the object is built up in the given structure. What this means with respect to the function closure is obvious: it is the local step of simply going one step back.

Given a proof structure $D\overline{F}$ and a proof object $a \in Def(D\overline{F})$, we define the notion of *subobjects* $S(D\overline{F}, a)$ as follows:

$$\begin{aligned} &a \in S(D\overline{F}, a). \\ &\text{If } f(b\_1, \dots, b\_n) \in S(D\overline{F}, a), \text{ then } b\_1, \dots, b\_n \in S(D\overline{F}, a). \\ &\text{If } F(f) \in S(D\overline{F}, a), \ b \in S(D\overline{F}, a) \text{ and } b \in \text{dom}(f), \text{ then } f(b) \in S(D\overline{F}, a). \end{aligned}$$

Given a proof structure $D\overline{F}$ and a proof object $a \in Def(D\overline{F})$, we define two elementary operators, *proof unfolders*, for objects in $S(D\overline{F}, a)$:

(Δ) $\Delta(D\overline{F}, f(b_1, \dots, b_n), i) = b_i$

(Φ) $\Phi(D\overline{F}, F(f), b) = f(b)$; this operation is defined whenever $f(b)$ is defined.

The Δ-unfolder unfolds the function closure, i.e., the structure built up by applying functions to given objects. In rule terms this corresponds to looking at the premises in a rule application where no assumptions are discharged; note that there is no distinction here between introduction and elimination rules. We make, for example, no specific distinction between ∧-introduction and ∧-elimination, or, in a system of set theory, between ∈-introduction and ∈-elimination. The reason for this is simply that there is no structural difference, at least locally, to consider. This opens up questions about how to characterize directions and ways of reasoning in a proof.

The Φ-unfolder unfolds the functional closure, i.e., the structure built up by applying a functional to a function over a given domain. In rule terms this corresponds to looking at the function premise in a rule application where assumptions are discharged, and applying the function to a subproof for which the function is defined. Note that this unfolder is not local in nature, as it both works on a function and also uses a subproof that need not be local to the rule application.

In a natural way we may build more complex unfolders by composition:

$$\begin{aligned} \Delta\Phi(DF, F(f), b, i) &= \Delta(DF, f(b), i) \\ \Phi\Delta(DF, F(f), g(b_1, \dots, b_n), i) &= \Phi(DF, F(f), b_i) \\ \Delta\Delta(DF, f(a, g(b, c)), 2, 2) &= \Delta(DF, g(b, c), 2) \end{aligned}$$

Alternations of unfolders, i.e., ΔΦΔΦ…, involve increasing complexity. We will refer to the index $i$ as the Δ-index.
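Restricted to term proof structures, the two unfolders have a direct computational reading. In the Python sketch below (an encoding of our own: function terms as tagged tuples, $F(f)$ as a tagged Python function) the Δ-unfolder projects a premise and the Φ-unfolder applies the function under $F$; composing them on the proof object $ie(F(f), ie(a, b))$ of Section 4 yields its normal form $ki(b, ie(a, b))$:

```python
# Toy encoding (ours): a proof object is an atom, a function term
# ('ie', t1, t2) / ('ki', t1, t2), or a functional term ('F', f)
# with f a Python function on proof objects.

def delta(term, i):
    """Delta-unfolder: the i-th immediate subterm of a function term."""
    assert term[0] != 'F'
    return term[i]            # term = (name, t_1, ..., t_n), 1-indexed

def phi(functional, b):
    """Phi-unfolder: apply the function under F to a subproof b."""
    name, f = functional
    assert name == 'F'
    return f(b)

# Section 4 example: f(x) = ki(b, x), proof object ie(F(f), ie(a, b)).
f = lambda x: ('ki', 'b', x)
proof = ('ie', ('F', f), ('ie', 'a', 'b'))

# Unfold at both premises, then apply the function under F to the
# second premise (the composition PhiDelta, up to bookkeeping):
normal_form = phi(delta(proof, 1), delta(proof, 2))
```
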

These proof unfolders can be generalized to unfolders for a larger collection of functional closures in the following way:

$$a \in S(D\overline{F}, a).$$

$$\text{If } b \in S(D\overline{F}, a), \text{ then } D\overline{F}(b) \subset S(D\overline{F}, a).$$

$$\text{If } \bigwedge\_{I} A\_{i} \in S(D\overline{F}, a), \text{ then } A\_{i} \in S(D\overline{F}, a) \text{ for } i \in I.$$

$$\text{If } (b \to f(b)) \in S(D\overline{F}, a) \text{ and } b \in S(D\overline{F}, a), \text{ then } f(b) \in S(D\overline{F}, a).$$

For the unfolder operators we need to add a projection operator:

$$(\Delta) \qquad \Delta(DF, b) = B, \text{ where } B \in DF(b),$$

$$(\Phi) \qquad \Phi(DF, F(f), b) = f(b),$$

$$(\Theta) \qquad \Theta(DF, \bigwedge{}_I(B_i), j) = B_j.$$

The Δ-unfolder unfolds the definition of the proof by giving us a definiens for the definiendum, which includes unfolding the function closure, i.e., the structure built up by applying functions to given objects. Note that in the general case, when the given definition is non-deterministic, the Δ-unfolder will also be non-deterministic. The Θ-unfolder is the projection operator.

Initial composition of these unfolders looks like this:

$$\begin{aligned} \Delta\Phi(DF, F(f), b) &= \Delta(DF, f(b)) \\ \Theta\Delta(DF, b, j) &= \Theta(DF, \bigwedge{}_I B_i, j) \text{ if } DF(b) = \{\bigwedge{}_I B_i\}. \end{aligned}$$

We leave the proof structure term out of the unfolders when the context is clear, i.e., we simply write $\Phi(F(f), b)$ instead of $\Phi(DF, F(f), b)$ when the proof structure context is given.

Let us consider the example above in Section 4 of a proof structure $IE(a, b)$ again, and look at the proof object $ie(F(f), ie(p \to q, p))$:

$$\Phi(F(f), ie(p \to q, p)) = ki(p, ie(p \to q, p)),$$

which is the *normal form* of the proof that $ie(F(f), ie(p \to q, p))$ represents.

$$\begin{aligned} \Delta(ki(p, ie(p \to q, p)), 1) &= p, \\ \Delta\Delta(ki(p, ie(p \to q, p)), 2, 1) &= p \to q. \end{aligned}$$

#### **6 Proof equations**

Proof unfolders are, in general terms, *operators* on proof structures. *Proof equations*

$$\overline{O}_1(\overline{T}_1, \overline{i}_1) = \overline{O}_2(\overline{T}_2, \overline{i}_2)$$

can be used to characterize structural properties of proofs, as they display, in a schematic way, properties of unfolding sequences of proofs. Here $\overline{O}$, $\overline{T}$, $\overline{i}$ are unfolder operator terms, proof object terms, and numerals or numeral variables for the Δ-operator indexes, respectively.

A proof structure *solves* a given proof equation if there are suitable substitutions over the given proof structure validating the equation.

A typical simple proof equation could look like this:

$$\Delta^n(D, x, \overline{i}) = \bot$$

where a proof object validating the equation would have the form $f_{i_1} \dots f_{i_n}(\bot)$.

Another elementary example could be taken from Section 4:

$$\Phi\Delta(F(f), ie(F(f), ie(\mathbf{x}, \mathbf{y})), i) = ki(\mathbf{y}, ie(\mathbf{x}, \mathbf{y})).$$

The proof structure $IE(a, b)$ then has a solution to this equation, i.e.,

$$\Phi\Delta(F(f), ie(F(f), ie(a, b)), 2) = ki(b, ie(a, b)).$$

In the general case a simple example could look like:

$$\Delta^2\Theta\Delta\Phi(D, X, x, i) = X.$$

#### **7 The paradoxes of Russell and Ekman**

There are two fundamental questions here to dwell on:

(Q1) How can we analyse the structure of given proofs directly in terms of functional closures?

(Q2) How can we use functional closures to classify proofs in formal systems of natural deduction?
With respect to question Q2, it is clear that we can of course use the ideas of structural unfolders and proof equations to discuss the classification of proofs in formal systems of natural deduction. As a somewhat canonical example we may compare the Russell and Ekman paradoxes (Prawitz, 1965; Ekman 1994; 1998; Schroeder-Heister and Tranchini, 2017) across different formal systems of natural deduction. It is clear that these two proofs build on the same proof idea, i.e., diagonalization. How could we make this more precise? In the present context we can say that both proof structures are circular in terms of functional fixed points: they solve similar fixed-point proof equations. In these cases a functional closure analysis works as a tool to discuss formal proofs in systems of natural deduction; we assign a functional closure to each of these proofs and look for proof equations that we can use to compare the structure of the two proofs. In more general terms this, of course, leads us to a special case of question Q1.

Normalization as a structural procedure makes a distinct difference between the two proofs. The original Russell derivation is a derivation of *absurdity* in the implication calculus extended with introduction and elimination rules for *set membership*, corresponding to full set comprehension (Prawitz, 1965), while the Ekman derivation (Ekman, 1998) is a proof in the propositional implication calculus of $\neg(p \leftrightarrow \neg p)$. Both proofs are intuitively circular with respect to normalization as a procedure of removing roundabout reasoning, but the example of Ekman is certainly formally normalizable with respect to standard procedures. See Schroeder-Heister and Tranchini (2017) for an in-depth discussion of this issue.

As a formal natural deduction derivation $\mathcal{R}$, the Russell paradox looks like this, where $t = \{x \mid \neg(x \in x)\}$ and where $\neg A$ is short for $A \to \bot$:

$$\frac{\mathcal{R}_1 \qquad \dfrac{\mathcal{R}_1}{t \in t}}{\bot}$$

where


$$\mathcal{R}_1 = \frac{\dfrac{\dfrac{[t \in t]}{\neg(t \in t)} \qquad [t \in t]}{\bot}}{\neg(t \in t)}$$

The last introduction of $\neg(t \in t)$ is a cut formula in the derivation, and if we apply the standard rule of reduction we obtain the following derivation:

$$\frac{\dfrac{\dfrac{\mathcal{R}_1}{t \in t}}{\neg(t \in t)} \qquad \dfrac{\mathcal{R}_1}{t \in t}}{\bot}$$

Here $t \in t$ becomes a cut formula. Applying the following rule of reduction

$$\dfrac{\dfrac{\neg(t \in t)}{t \in t}}{\neg(t \in t)} \quad \Rightarrow \quad \neg(t \in t)$$

we will come back to the derivation we started with.

Question Q1 is of a programmatic nature. It would, of course, be possible to perform such an analysis in two steps: first a formalization of proofs in some formal system, and then an abstract analysis of these proofs in terms of functional closures. A more direct method would be to make a *local* and direct analysis of a given proof. There is naturally nothing absolute about such an approach, as an informal proof is not in itself a precise mathematical object. The analysis would give a somewhat abstract mathematical definition of the proof in structural terms. Central parts of any non-trivial proof involve complex constructions, i.e., definitions, and it is in each case an interesting problem how to isolate the reasoning core of the proof. The Ekman paradox (Ekman 1994; 1998) is, for instance, the result of an analysis of the proof of Cantor's theorem formalized in a natural deduction system for set theory. Consider the following derivation $\mathcal{E}$ of $\bot$ from $p \leftrightarrow \neg p$, where we shorten the proof by using rules for equivalence elimination:

$$\frac{\mathcal{E}_1 \qquad \dfrac{p \leftrightarrow \neg p \qquad \mathcal{E}_1}{p}}{\bot}$$

where

$$\mathcal{E}_1 = \frac{\dfrac{\dfrac{p \leftrightarrow \neg p \qquad [p]}{\neg p} \qquad [p]}{\bot}}{\neg p}$$

Let us consider a *Russell object* in a term proof system representing the structures of proofs in set theory with full comprehension, i.e., where $R$ is a term proof structure including functions that represent the rules for implication and set membership.

Let, once again, $t = \{x \mid \neg(x \in x)\}$. Assume $t \in t$; then $\neg(t \in t)$ by ∈-elimination. This is $r_2^{-1}(\mathbf{x})$. From this, by →-elimination, we have $\bot$. This is $r_1(r_2^{-1}(\mathbf{x}), \mathbf{x})$, and we have our function $f$. By →-introduction we may conclude $\neg(t \in t)$. This is $F(f)$. Using this we may conclude $\bot$, which is $r_1(F(f), r_2(F(f)))$, where $r_2(F(f))$ represents the ∈-introduction of $t \in t$, i.e.,

$$r\_1(F(f), r\_2(F(f))), \text{ where } r\_2(F(f)) \in \text{dom}(f) \text{ and } f(\mathbf{x}) = r\_1(r\_2^{-1}(\mathbf{x}), \mathbf{x}).$$

We then have:

$\Delta\Delta\Delta\Phi(R, F(f), r_2(F(f)), 1, 1, 1) = F(f)$,


So the pair $(F(f), r_2(F(f)))$ solves the following proof equation in $R$:

$$
\Delta^3 \Phi(R, X, Y, 1, 1, 1) = X.
$$

An *Ekman object* in a term proof system representing the structures of proofs in propositional logic could be defined in the following way.

Let $a = p \leftrightarrow \neg p$. Assume $p$; then we may conclude $\neg p$ by ↔-elimination (2). This is $e_2'(a, \mathbf{x})$, and we may then further conclude $\bot$ by →-elimination. This is $e_1(e_2'(a, \mathbf{x}), \mathbf{x})$, and here we have our function $f$, and we can conclude $\neg p$ by →-introduction, which is $F(f)$. Using $F(f)$ we may conclude $p$ by ↔-elimination (1), i.e., $e_2(a, F(f))$. By →-elimination we then get $\bot$, which is $e_1(F(f), e_2(a, F(f)))$, that is,

$$e_1(F(f), e_2(a, F(f))), \text{ where } e_2(a, F(f)) \in \mathrm{dom}(f) \text{ and } f(\mathbf{x}) = e_1(e_2'(a, \mathbf{x}), \mathbf{x}).$$

Now we have

$\Delta\Delta\Delta\Phi(E, F(f), e_2(a, F(f)), 1, 2, 2) = F(f)$,

$$(1)\ \Phi(F(f), e_2(a, F(f))) = e_1(e_2'(a, e_2(a, F(f))), e_2(a, F(f))),$$


So the pair $(F(f), e_2(a, F(f)))$ solves the same equation, modulo Δ-index, in $E$ as the pair $(F(f), r_2(F(f)))$ does in $R$, i.e.

$$
\Delta^3 \Phi(E, X, Y, i_1, i_2, i_3) = X.
$$

This is then one way to make precise what is intuitively clear, namely that these two proofs have, in principle, the same structure, i.e., that the proof idea is the same in the two proofs. The differences in Δ-index reflect the difference between explicit rules in the Russell paradox and implicit rules in the form of assumptions/axioms in the Ekman paradox. But still, both proof systems satisfy the same elementary fixed-point equation.
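That both paradoxical objects solve the equation $\Delta^3\Phi(\cdot, X, Y, \overline{i}) = X$, with Δ-indices 1, 1, 1 for the Russell object and 1, 2, 2 for the Ekman object, can be replayed mechanically. A Python sketch in the same toy encoding as before (our own: tagged tuples for term functions, a tagged Python function for $F(f)$; Δ projects an argument, Φ applies the function under $F$):

```python
# Check the fixed-point equation for the Russell and Ekman objects.

def delta(term, i):
    return term[i]            # term = (name, t_1, ..., t_n), 1-indexed

def phi(functional, b):
    return functional[1](b)   # functional = ('F', f)

# Russell object: f(x) = r1(r2inv(x), x), cut pair (F(f), r2(F(f))).
Ff_r = ('F', lambda x: ('r1', ('r2inv', x), x))
Y_r = ('r2', Ff_r)
lhs_r = delta(delta(delta(phi(Ff_r, Y_r), 1), 1), 1)   # indices 1, 1, 1

# Ekman object: f(x) = e1(e2'(a, x), x), cut pair (F(f), e2(a, F(f))).
Ff_e = ('F', lambda x: ('e1', ('e2p', 'a', x), x))
Y_e = ('e2', 'a', Ff_e)
lhs_e = delta(delta(delta(phi(Ff_e, Y_e), 1), 2), 2)   # indices 1, 2, 2
```

In both cases the triple Δ-unfolding of the Φ-application returns the functional term $F(f)$ itself, which is the circularity the fixed-point equation expresses.
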

A solution to the equation

$$
\Delta^3 \Phi(DF, X, Y, i\_1, i\_2, i\_3) = X
$$

must be of the form $(F(f), b)$ for some $b$, where $f(b)$ is defined and where $F(f) \in S(D\overline{F}, f(b))$.

As a function, we think of an unfolder term $\overline{O}(D\overline{F}, X, Y, \overline{i})$ as a *proof form*. Proof equations such as

$$
\Delta^3 \Phi(DF, X, Y, \tilde{i}) = X
$$

then display structural *proof expressions*.

Within the abstract framework of functional closures it is natural to say that a *cut* is a pair $(F(f), b)$, and if we consider a proof object $g(F(f), b)$, to say that $(F(f), b)$ is the *main cut* in this proof object. So with respect to the paradoxes of Russell and Ekman we have main cuts producing the same proof expression in relation to the proof form $\Delta^3\Phi(D\overline{F}, X, Y, i_1, i_2, i_3)$. The proof equation

$$
\Delta^3 \Phi(DF, X, Y, i\_1, i\_2, i\_3) = X
$$

then expresses the central proof idea common to both proofs.

In this fixed-point equation the fixed point $F(f)$ is *hidden* in $f(b)$ for some given $b$ and *unlocked* by iterations of the Δ-unfolder. The simplest example of such a proof object is the definition *Id*:

$$F(id) = \bigwedge{}_{\mathrm{dom}(id)} (\mathbf{x} \to \mathbf{x}),$$

where we of course need zero iterations to unlock $F(id)$.

#### **8 General functional closure interpretations**

Given a functional closure $D(X, \Lambda(f), \phi(\mathbf{x}, \mathbf{y}), \mathcal{K})$, we may interpret natural deduction derivations in the implication calculus in terms of locally definable functions over this functional closure in the following way.

Let $\mathcal{D}(\overline{p})$ be a derivation in the implication calculus with open assumptions $\overline{p}$. By recursion on $\mathcal{D}$ we

(i) define an interpretation $\pi\mathcal{D}$ and

(ii) prove that $\pi\mathcal{D}$ is locally definable in the closure, i.e., $\overline{a} \vdash \pi\mathcal{D}(\overline{a})$.

If $\mathcal{D} = p$, let $\pi\mathcal{D}(\overline{a}) = \overline{a}(p)$.

Clearly $\overline{a} \vdash \pi\mathcal{D}(\overline{a})$, since $\overline{a} \vdash \overline{a}(p)$ holds by definition.

If $\mathcal{D}(\overline{p})$ ends with an implication elimination with premise derivations $\mathcal{D}_1(\overline{p}_1)$ and $\mathcal{D}_2(\overline{p}_2)$, let

$$\pi \mathcal{D}(\overline{a}) = \phi(\pi \mathcal{D}\_1(\overline{a\_1}), \pi \mathcal{D}\_2(\overline{a\_2})), \text{ where } \overline{a\_i} = \overline{a} \mid \overline{p}\_i \text{ for } i = 1, 2.$$

By the IH, $\overline{a_i} \vdash \pi\mathcal{D}_i(\overline{a_i})$ for $i = 1, 2$. Clearly $\overline{a} \vdash \pi\mathcal{D}_i(\overline{a_i})$ (for $i = 1, 2$), so $\overline{a} \vdash \phi(\pi\mathcal{D}_1(\overline{a_1}), \pi\mathcal{D}_2(\overline{a_2}))$. Thus $\overline{a} \vdash \pi\mathcal{D}(\overline{a})$.

If $\mathcal{D}(\overline{p})$ ends with an implication introduction with premise derivation $\mathcal{D}_1(\overline{p}_1)$, where $\overline{p}_1 = \overline{p} + p$, let

$$
\pi \mathcal{D}(\overline{a}) = \Lambda(f\_{(\overline{a})}) \text{ where } f\_{(\overline{a})}(a) = \pi \mathcal{D}\_1(\overline{a}\_1) \text{ for } \overline{a} = \overline{a}\_1 \mid \overline{p} \text{ and } \overline{a}\_1(p) = a.
$$

$\overline{a} \vdash \Lambda(f_{(\overline{a})})$ follows from $\overline{a}, a \vdash f_{(\overline{a})}(a)$ for $a \in \mathrm{dom}(f_{(\overline{a})})$. By definition this is $\overline{a}, a \vdash \pi\mathcal{D}_1(\overline{a}_1)$. This follows from $\overline{a}_1 \vdash \pi\mathcal{D}_1(\overline{a}_1)$, which follows from the IH.

The informal interpretations of the paradoxes of Russell and Ekman correspond to the obvious term interpretation $\tau$, where for the Russell example we extend the closure to include functions representing rules for ∈.
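The recursive interpretation above can be sketched in Python, reading derivations as tagged trees (an encoding of our own): an assumption looks up the environment $\overline{a}$, an implication elimination builds a $\phi$-term, and an implication introduction builds $\Lambda$ of a function that interprets the premise derivation under an extended assignment.

```python
# Sketch (assumed encoding): derivations as tagged trees
# ('ass', p) | ('app', d1, d2) | ('lam', p, d), interpreted over a
# term closure with phi (implication elimination) as a tuple
# constructor and Lam (implication introduction) as a tagged function.

def interpret(deriv, env):
    kind = deriv[0]
    if kind == 'ass':                       # an open assumption p
        return env[deriv[1]]
    if kind == 'app':                       # implication elimination
        return ('phi', interpret(deriv[1], env), interpret(deriv[2], env))
    if kind == 'lam':                       # implication introduction
        _, p, body = deriv
        # Lam(f) where f(a) interprets the premise with p bound to a
        return ('Lam', lambda a: interpret(body, {**env, p: a}))
    raise ValueError(kind)

# pi-D for D = the elimination applying assumption p to assumption q:
d = ('app', ('ass', 'p'), ('ass', 'q'))
value = interpret(d, {'p': 'u', 'q': 'v'})          # ('phi', 'u', 'v')

# An introduction over d, then the abstracted function applied to 'w':
lam = interpret(('lam', 'p', d), {'q': 'v'})
```
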

#### **9 Reduction characterizations**

Given an interpretation $\tau$ of derivations in natural deduction for the implication calculus, an unfolding of proof objects $\tau\mathcal{D}$ can be seen as one way to both define and characterize reductions; this relates to the notion of simple operations on derivations discussed in Hallnäs (1988).

Take the example from Section 4.

$$\frac{\dfrac{\dfrac{p \quad [q]}{p \land q}}{q \to p \land q} \qquad \dfrac{p \to q \quad p}{q}}{p \land q} \quad \Rightarrow \quad \frac{p \quad \dfrac{p \to q \quad p}{q}}{p \land q},$$

$$\begin{aligned} \tau(\mathcal{D}) &= ie(F(f), ie(a, b)), \\ \tau(\mathrm{red}\,\mathcal{D}) &= ki(b, ie(a, b)) = \Phi(F(f), ie(a, b)). \end{aligned}$$

For the reductions of the Russell derivation we have

$$\frac{\mathcal{R}_1 \quad \dfrac{\mathcal{R}_1}{t \in t}}{\bot} \;\Rightarrow\; \frac{\dfrac{\dfrac{\mathcal{R}_1}{t \in t}}{\neg(t \in t)} \quad \dfrac{\mathcal{R}_1}{t \in t}}{\bot} \;\Rightarrow\; \mathcal{R},$$

$$\begin{aligned} \tau(\mathcal{R}) &= r_1(F(f), r_2(F(f))), \\ \tau(\mathrm{red}(\mathcal{R})) &= r_1(r_2^{-1}(r_2(F(f))), r_2(F(f))) \\ &= \Phi(F(f), r_2(F(f))), \\ \tau(\mathrm{red}(\mathrm{red}(\mathcal{R}))) &= r_1(F(f), r_2(F(f))) \\ &= r_1(\Delta^2(r_2^{-1}(r_2(F(f))), 1, 1), r_2(F(f))) \\ &= \tau\mathcal{R}. \end{aligned}$$

So Φ defnes/characterizes the frst reduction and Δ 2 the second one. In the form of a diagram:

$$\begin{array}{ccccc} \mathcal{R} & \xrightarrow{\ \mathrm{red}\ } & \mathcal{R}' & \xrightarrow{\ \mathrm{red}\ } & \mathcal{R}'' \\ \downarrow\tau & & \downarrow\tau & & \downarrow\tau \\ \tau\mathcal{R} & \xrightarrow{\ \Phi\ } & \tau\mathcal{R}' & \xrightarrow{\ \Delta^2\ } & \tau\mathcal{R}'' \end{array}$$

This is a structural definition, and wider than the standard one, as we know from the Ekman paradox, where the interesting reduction of course is the following:

$$\frac{\mathcal{E}_1 \quad \dfrac{p \leftrightarrow \neg p \quad \mathcal{E}_1}{p}}{\bot} \;\Rightarrow\; \frac{\dfrac{p \leftrightarrow \neg p \quad \dfrac{p \leftrightarrow \neg p \quad \mathcal{E}_1}{p}}{\neg p} \quad \dfrac{p \leftrightarrow \neg p \quad \mathcal{E}_1}{p}}{\bot}$$

$\tau(\mathrm{red}(\mathrm{red}(\mathcal{E}))) = e_1(\Delta^2(e_2'(a, e_2(a, F(f))), 2, 2), e_2(a, F(f))) = \tau\mathcal{E}$.

The expected objection here is that this defnition/characterization does not take into account the distinction between introduction and elimination rules and thus does not respect the intended semantics of the derivation and reduction rules.

Structurally with respect to unfoldings

$$\frac{\frac{t \in t}{\neg(t \in t)}}{t \in t}$$

and

$$\frac{p \land (q \land r)}{\frac{q \land r}{r}}$$

are isomorphic, whereas there is a huge difference semantically. But this isomorphism is local in nature; the semantic difference will leave its traces on the level of global closure properties. So, yes, there is really no big difference between these two derivations if we isolate them. The difference shows itself in the ways we can use them in a bigger context, e.g., as in $\Phi\Delta^3(F(f), g(F(f)))$.

So why does the Ekman reduction make sense? It is of course a matter of removing some clearly roundabout reasoning, but we can also argue from a more structural point of view. The slogan will then be that any unfolding for which $\Gamma \vdash_D C$ is an invariant will define a valid reduction.

Unfolding a proof object tells us how the proof argument is built in terms of function and functional structure. A proof equation is then a *representation* of such a structure. A fixed point equation represents in that sense a circular argument, i.e., the unfolding sequence just circles around itself. The reason why this circular structure is not visible through standard reductions in propositional logic, the Ekman paradox, is of course that it is hidden in minor premises, in contrast to the explicit rules of full comprehension in natural deduction for naive set theory. This is simply something that is not captured by the semantically motivated duality between introduction and elimination rules. It seems as if this is a structural phenomenon, but so is the idea of roundabout reasoning.

Is the Russell paradox R the same proof as the Ekman paradox E? It is clearly a matter of the same proof idea, and the differences between them are definitely less interesting than the similarities. Using the functional closure objects R and E we may say that the proof equation $(\Delta^3 \Phi(F(f), Y) = F(f))$ *shows* that they are the same proof modulo a difference in Δ-index, i.e., a difference in the choices of premises of Δ-unfoldings. So one way of answering the question is to say that we should look at the difference in the ways in which these two derivations solve the fixed point equation:

$$(\Delta^3 \Phi(F(f), Y) = F(f))(\mathcal{R}) \mid (\Delta^3 \Phi(F(f), Y) = F(f))(\mathcal{E}) \,.$$

#### **References**



## **Truth-Value Constants in Multi-Valued Logics**

Nissim Francez and Michael Kaminski

**Abstract** In some presentations of classical and intuitionistic logics, the object-language is assumed to contain (two) *truth-value constants*: ⊤ (verum) and ⊥ (falsum), which are, respectively, true and false under every bivalent valuation. We are interested in defining and studying analogous constants ‡ₖ, 1 ≤ k ≤ n, that in an arbitrary multi-valued logic over truth-values Vₙ = {ν₁, . . . , νₙ} have the truth-value νₖ under every (multi-valued) valuation. As is well known, the absence or presence of such constants has a significant deductive impact on the logics studied. We define such constants proof-theoretically via their associated I/E-rules in a natural-deduction proof system. In particular, we propose a generalization of the notions of *contradiction* and *explosiveness* of a logic to the context of multi-valued logics.

#### **1 Introduction**

In some presentations of classical logic, and more often of intuitionistic logic, the object-language is assumed to contain (two) *truth-value constants*<sup>1</sup> (0-ary connectives): ⊤ (verum) and ⊥ (falsum). These are, respectively, true under every valuation (truth-value assignment) and false under every valuation.

In classical and intuitionistic natural-deduction (ND) proof-systems, the respective introduction and elimination rules (I/E-rules) for those constants are (with φ ranging over object-language formulas)

Nissim Francez
Technion – Israel Institute of Technology, Haifa, Israel, e-mail: francez@cs.technion.ac.il

Michael Kaminski
Technion – Israel Institute of Technology, Haifa, Israel, e-mail: kaminski@cs.technion.ac.il

© The Author(s) 2024

<sup>1</sup> Those constants should not be confused with other uses of those symbols as additional truth-values, as is done, for example, in bilattice logics.

T. Piecha and K. F. Wehmeier (eds.), *Peter Schroeder-Heister on Proof-Theoretic Semantics*, Outstanding Contributions to Logic 29, https://doi.org/10.1007/978-3-031-50981-0\_14

$$\frac{\bot}{\varphi}\ (\bot E) \qquad\qquad \frac{\varphi}{\top}\ (\top I)$$

and there are no (⊥I) and (⊤E) rules.

Recall that (⊥E) is the proof-theoretic source of the *explosion* property of logics admitting this rule.
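To make the explosion step explicit (a standard observation, displayed here for convenience), any derivation of ⊥ from Γ extends to a derivation of an arbitrary φ by a single application of (⊥E):

```latex
% Explosion: an arbitrary formula \varphi follows from any
% derivation of absurdity by one application of (\bot E).
\[
\dfrac{\begin{array}{c} \Gamma \\ \vdots \\ \bot \end{array}}{\varphi}\ (\bot E)
\]
```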

We are interested in studying analogous constants in arbitrary multi-valued logics (*mv*-logics) and their associated I/E-rules. As is well known, the absence or presence of such constants may have a significant deductive impact on such logics, in several respects<sup>2</sup>. For example:

1. In Avron (2014) it is shown that as long as the propositional constants **t** and **f** are not included in the object-language, any language-preserving extension of any important fragment of the relevance logics R and RMI can have only classical tautologies as theorems.

2. In Avron and Konikowska (2005), it is shown that if the object-language of a multi-valued logic contains a truth-value constant for *every* truth-value, there is a significant simplification of sequent calculi for such a language. Those constants are used to facilitate a distinction between two kinds of derivations in proof-systems for such logics.

3. In Pelletier and Hazen (2019) it is observed that adding constants for the four truth-values of FDE (augmented with a classical-like conditional) renders the language functionally complete (a property not holding for the augmented FDE without such constants).

#### **2 Located formulas and sequents**

For n ≥ 2, let Vₙ = {ν₁, . . . , νₙ} be a collection of truth-values underlying a multi-valued logic Lₙ with a propositional object-language with some unspecified connectives defined by truth-tables over Vₙ. Furthermore, Lₙ is assumed to contain the constants ‡ₖ, 1 ≤ k ≤ n, where, by definition, for every valuation σ

$$
\sigma [\![ \sharp\_k ]\!] = \nu\_k.
$$

Let $\hat{n} = \{1, \ldots, n\}$.

**Definition 2.1** A *located formula* (l-formula) is a pair (φ, k), where φ is an object-language formula and k ∈ $\hat{n}$.

The intended interpretation of (φ, k) is that φ is associated with the truth-value νₖ ∈ Vₙ.

<sup>2</sup> Those examples are mentioned only for exemplifying the existence of an impact of the presence of the constants. The actual content of those examples is not directly related to what we have to say.

**Definition 2.2** A *located sequent* (l-sequent) Π has the form Γ : Δ, where Γ, Δ are (possibly empty) finite collections<sup>3</sup> of l-formulas.

We use $\bar{\Pi}$ for sets of l-sequents. Let σ range over valuations, mapping formulas to truth-values in Vₙ; for atomic sentences the mapping is arbitrary, and it is extended to compound formulas so as to respect the truth-tables of the operators. Below, we define the central semantic notions as applicable to l-sequents.

**Definition 2.3** *Satisfaction* is defined as follows:

$$\begin{aligned} \sigma \Vdash \Gamma : \Delta \iff{} & \text{if } [\![\varphi]\!]\_\sigma = \nu\_k \text{ for all } (\varphi, k) \in \Gamma, \\ & \text{then } [\![\psi]\!]\_\sigma = \nu\_j \text{ for some } (\psi, j) \in \Delta. \end{aligned}$$

Call such a (ψ, j) a *witness* for Γ : Δ's satisfaction by σ.

*Consequence* is defned as follows:

$$
\bar{\Pi} \Vdash \Pi \; :\Longleftrightarrow \text{ for every } \sigma \colon \sigma \Vdash \Pi' \text{ for all } \Pi' \in \bar{\Pi} \text{ implies } \sigma \Vdash \Pi.
$$
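As a quick operational gloss on Definition 2.3 (our own sketch, not part of the paper's formalism; all names are illustrative), satisfaction of an l-sequent can be prototyped directly: realizing every located formula in Γ must force a witness in Δ.

```python
# Illustrative sketch of Definition 2.3 (not the authors' notation):
# a valuation maps formulas to truth-values 1..n; an l-sequent is a
# pair (gamma, delta) of sets of located formulas (formula, k).

def satisfies(valuation, gamma, delta):
    """True iff the valuation satisfies the l-sequent gamma : delta."""
    # If some antecedent l-formula is not realized, the sequent holds vacuously.
    if any(valuation[phi] != k for (phi, k) in gamma):
        return True
    # Otherwise a witness (psi, j) with valuation[psi] == j must exist in delta.
    return any(valuation[psi] == j for (psi, j) in delta)

# Example in a 3-valued setting: sigma assigns p -> 2, q -> 1.
sigma = {"p": 2, "q": 1}
print(satisfies(sigma, {("p", 2)}, {("q", 1)}))  # witness (q, 1): True
print(satisfies(sigma, {("p", 2)}, {("q", 3)}))  # no witness: False
print(satisfies(sigma, {("p", 1)}, set()))       # antecedent unrealized: True
```

The third call shows why a sequent Γ : ∅ can only be satisfied vacuously, which is the situation exploited in Definition 2.4.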

We are interested in proof-systems sound and (strongly) complete for this consequence relation. In Francez and Kaminski (2019) and Kaminski and Francez (2021), we present natural-deduction proof-systems over l-sequents sound and (strongly) complete for the above consequence relation, constructed from the truth-tables in a uniform way. The multi-valued ND-systems Nⁿ (over l-sequents) with their structural and logical rules for an arbitrary p-ary connective are presented in the appendix.


**Definition 2.4** Consider any *mv*-logic Lₙ. The Lₙ *mv-contradictions* are the Nⁿ-derivable l-sequents Γ : ∅, in which case Γ is *mv*-inconsistent.

Clearly, for any *mv*-consistent Γ, ⊮ Γ : ∅, i.e., some valuation fails to satisfy Γ : ∅.

**Definition 2.5** Lₙ is *mv-explosive* if for *every* Δ:

$$
\Gamma \,:\, \emptyset \vdash\_{\mathsf{N}^n} \Gamma \,:\, \Delta.
$$

This notion of multi-valued explosion, the derivability of every l-sequent from an *mv*-contradiction, is the natural generalization of the usual notion of explosion.

**Observation 2.6** *Due to the presence of the right-weakening structural rule (see the appendix), every mv-logic* Lₙ *is mv-explosive.*

<sup>3</sup> The exact nature of a collection, e.g., a set or a multi-set, depends on the specific logic being defined.

#### **3 Bivalent l-sequents**

As a preliminary step towards the introduction of general truth-value constants, we recast the traditional I/E-rules for ⊥ and ⊤ in terms of l-sequents. For readability, we use the mnemonic V₂ = {t, f} instead of {ν₁, ν₂}.

#### **I-rules**

From their definition, the located constants (⊥, t) and (⊤, f) can never be non-trivially introduced, because there is no σ s.t. ⟦⊥⟧_σ = t, and no σ s.t. ⟦⊤⟧_σ = f. The trivial I-rules

$$\frac{\Gamma : \Delta}{\Gamma : \Delta, (\bot, t)}\ (\bot I\_t) \qquad \frac{\Gamma : \Delta}{\Gamma : \Delta, (\top, f)}\ (\top I\_f)$$

are just instances of (right) weakening.

On the other hand, every σ is such that ⟦⊥⟧_σ = f, and ⟦⊤⟧_σ = t. Thus, we have

$$\frac{}{\Gamma:\Delta,(\top,t)}\ (\top I\_t) \qquad \frac{}{\Gamma:\Delta,(\bot,f)}\ (\bot I\_f)$$

Note that the assignments of f to ⊥ and t to ⊤ depend on no (sub)formulas of Γ or Δ, so the rules (⊥I_f) and (⊤I_t) *have no premises*, and are, therefore, special cases of the general I-rules in the appendix.

#### **E-rules**

$$\frac{\Gamma:\Delta,(\bot,t)}{\Gamma:\Delta} \ (\bot E\_t) \quad \frac{\Gamma:\Delta,(\top,f)}{\Gamma:\Delta} \ (\top E\_f).$$

Clearly, (⊥, t) cannot be a witness for the premise, and there must be one in Δ itself, which is, therefore, also a witness for the conclusion of (⊥E_t); similarly for (⊤E_f). It is quite remarkable that the (⊥E_t) and (⊤E_f) rules lead to explosion (as shown below for the general case), naturally generalizing (⊥E). Again, both rules can be seen as special cases of the general E-rules in the appendix.

#### **4 The general case**

We now apply our uniform construction, yielding the general I/E-rules in the appendix, to obtain ($\sharp\_k I$) and ($\sharp\_k E$) for every 1 ≤ k ≤ n.

($\sharp\_k I$): Again, from their definition, the located constants (‡ₖ, k′) can never be non-trivially introduced for k ≠ k′, because there is no σ s.t. ⟦‡ₖ⟧_σ = ν_{k′}. The trivial I-rules are again instances of (right) weakening.

On the other hand, every σ is such that ⟦‡ₖ⟧_σ = νₖ. Thus, we have for every


1 ≤ k ≤ n:

$$\frac{}{\Gamma : \Delta, (\sharp\_k, k)}\ (\sharp\_k I\_k)$$

Note again that the assignments of νₖ to ‡ₖ depend on no (sub)formulas, so the rules ($\sharp\_k I\_k$) *have no premises*, and are special cases of the general I-rules in the appendix.

($\sharp\_k E$): For every 1 ≤ k, k′ ≤ n with k ≠ k′:

$$\frac{\Gamma: \Delta, (\sharp\_k, k')}{\Gamma: \Delta}\ (\sharp\_k E\_{k'})$$

(‡ₖ, k′) cannot be a witness for the premise, and there must be one in Δ itself, which is, therefore, also a witness for the conclusion of ($\sharp\_k E\_{k'}$). The ($\sharp\_k E\_{k'}$) rules can be seen as special cases of the general E-rules in the appendix.

*Remark 4.1* Note that the I/E-rules for the truth-value constants are *harmonious* (Dummett, 1993; Francez, 2015, Ch. 3), albeit *vacuously*: there are no maximal formulas consisting of located constants. This generalizes the vacuous harmony of ⊤ and ⊥.

It is once again quite remarkable that the ($\sharp\_k E\_{k'}$) rules lead to explosion (see Proposition 4.2), naturally generalizing (⊥E_t) and (⊤E_f).

**Proposition 4.2** *For any* Δ *and* 1 ≤ k, k′ ≤ n *with* k ≠ k′*:*

$$
\Gamma \colon (\sharp\_k, k') \vdash\_{\mathsf{N}^n} \Gamma \colon \Delta.
$$

*Proof* The derivation is

$$\frac{\dfrac{\Gamma : (\sharp\_k, k')}{\Gamma : \Delta, (\sharp\_k, k')}\ (WR\text{s})}{\Gamma : \Delta}\ (\sharp\_k E\_{k'}) \qquad \square$$

#### **5 Conclusion**

We have shown how to introduce truth-value constants into an arbitrary multi-valued logic over l-sequents. This is done by devising I/E-rules for those constants, generalizing the traditional I/E-rules for '⊤' and '⊥' in classical logic.

We also generalized the classical notions of a contradiction and an explosive logic to our setting.

Interesting issues that might be investigated in continuation of this work include:


## **Appendix: The proof-system** Nⁿ

**Initial l-sequents:** For every 1 ≤ i ≤ n: Γ, (φ, i) : Δ, (φ, i). Those initial l-sequents render the following *Weakening*-rules admissible:

$$\frac{\Gamma : \Delta}{\Gamma, (\varphi, i) : \Delta}\ (WL\_i) \qquad\qquad \frac{\Gamma : \Delta}{\Gamma : \Delta, (\varphi, i)}\ (WR\_i)$$

**Shifting rules:**

$$\frac{\Gamma, (\varphi, i) : \Delta}{\Gamma : \Delta, \varphi \times (\widehat{n} \setminus \{i\})} \; (\overrightarrow{s\_{i}}) \qquad\qquad \frac{\Gamma : \Delta, (\varphi, i)}{\Gamma, (\varphi, j) : \Delta} \; (\overleftarrow{s\_{i,j}}) \;\_{,\ j \neq i}$$

The rule says, roughly, that the truth-values are *exhaustive*.


#### **Coordination:**

$$\frac{\Gamma: \Delta, (\varphi, i) \quad \Gamma: \Delta, (\varphi, j)}{\Gamma: \Delta} \ (c\_{i, j})\_{, \ i \neq j}$$

This rule is a generalization to multi-valued logic of a structural rule by the same name in Rumfitt (2000), in a bilateral setting.

The rule says, roughly, that different truth-values associated with the same formula are contradictory, which allows one to eliminate such formula-truth-value pairs, as in the rule of propositional resolution.

**Operational rules:** Those are not really used here, and are presented for completeness only. The guiding lines for the construction are the following, expressed in terms of a generic p-ary operator, say '∗'. Notably, the operational rules for the constants can be seen as limit cases of those rules (p = 0).

(∗I): Such rules introduce a conclusion Γ : Δ, (∗(φ₁, . . . , φₚ), k). In general, if in the truth-table for '∗' the values $\nu\_{i\_j}$ for the $\varphi\_j$, 1 ≤ j ≤ p, yield the value νₖ for ∗(φ₁, . . . , φₚ), then there is a rule

$$\frac{\{\Gamma : \Delta, (\varphi\_j, i\_j) \mid 1 \le j \le p\}}{\Gamma : \Delta, (\*(\varphi\_1, \dots, \varphi\_p), k)} \ (\*I\_{i\_1, \dots, i\_p, k})$$

The rule $(\*I\_{i\_1, \dots, i\_p, k})$ has, thus, p premises.
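For illustration (our own example, not taken from the appendix), suppose '∗' is a binary connective over V₃ = {ν₁, ν₂, ν₃} whose truth-table contains the entry ∗(ν₂, ν₃) = ν₁. The corresponding instance of the scheme is then the two-premise rule:

```latex
% Hypothetical instance of (*I) for a binary connective over three
% truth-values, assuming the truth-table entry *(v_2, v_3) = v_1.
\[
\dfrac{\Gamma : \Delta, (\varphi_1, 2) \qquad \Gamma : \Delta, (\varphi_2, 3)}
      {\Gamma : \Delta, (\ast(\varphi_1, \varphi_2), 1)}\ (\ast I_{2,3,1})
\]
```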

(∗E): Such rules have a major premise Γ : Δ, (∗(φ₁, . . . , φₚ), k). The rule (∗Eₖ) has the form

$$\frac{\Gamma: \Delta, (\*(\varphi\_1, \ldots, \varphi\_p), k) \quad \{\Gamma, (\varphi\_1, k\_1), \ldots, (\varphi\_p, k\_p) : \Delta \mid \*(\nu\_{k\_1}, \ldots, \nu\_{k\_p}) = \nu\_k\}}{\Gamma: \Delta}\ (\*E\_k)$$

for each k = 1, . . . , n.

A detailed discussion of this system, presented in a different but equivalent notation, can be found in Francez and Kaminski (2019). Equivalent sequent calculi can be found in Kaminski and Francez (2021).

#### **References**


Rumfitt, I. (2000). 'Yes' and 'No'. *Mind* 109, 781–823.


# **Counterfactual Assumptions and Counterfactual Implications**

Bartosz Więckowski

**Abstract** We define intuitionistic subatomic natural deduction systems for reasoning with elementary *would*-counterfactuals and causal *since*-subordinator sentences. The former kind of sentence is analysed in terms of counterfactual implication, the latter in terms of factual implication. Derivations in these modal proof systems make use of modes of assumptions which are sensitive to the factuality status of the formula that is to be assumed. This status is determined by means of the reference proof system on top of which a modal proof system is defined. The introduction and elimination rules for counterfactual (resp. factual) implication draw on this status. It is shown that derivations in the systems normalize and that normal derivations have the subexpression/subformula property. An intuitionistically acceptable proof-theoretic semantics is formulated in terms of canonical derivations. The systems are applied to so-called counterpossibles and to related constructions.

**Key words:** assumption, conditional logic, counterfactuals, counterpossibles, intuitionistic logic, natural deduction, proof-theoretic semantics

## **1 Introduction**

The notion of assumption is essential to reasoning insofar as reasoning can be characterized as the activity of drawing conclusions from assumptions. It is the purpose of natural deduction systems (Gentzen, 1934; Jaśkowski, 1934) to depict this inferential activity as closely as possible and to lay it down formally in terms of inference rules. In his study of the notion of assumption in proof systems (Schroeder-Heister, 2004), Peter Schroeder-Heister, using a tree-style format, stresses the following asymmetry between assumptions and assertions (i.e., conclusions) in natural deduction:

Bartosz Więckowski
Institut für Philosophie, Goethe-Universität Frankfurt am Main, Germany, e-mail: wieckowski@em.uni-frankfurt.de

© The Author(s) 2024

T. Piecha and K. F. Wehmeier (eds.), *Peter Schroeder-Heister on Proof-Theoretic Semantics*, Outstanding Contributions to Logic 29, https://doi.org/10.1007/978-3-031-50981-0\_15

there is only an unspecific way of introducing assumptions, but there is both an unspecific and a specific way of introducing assertions. In order to introduce an *assumption* of formula A, one only has to state A as an assumption. Schroeder-Heister calls this way of introducing an assumption unspecific, because "the form of A is not specified" (Schroeder-Heister, 2004, p. 27). In order to introduce an assertion of A in an unspecific way, one has to proceed as in the case of introducing an assumption of A, thereby making A dependent on itself. To introduce an assertion of A in a specific way, one has to derive A as a conclusion using an inference rule (where A is an axiom, in case the inference rule considered has no premisses). Schroeder-Heister argues that this asymmetry is responsible for limitations of expressive power and considers proof systems which, like the sequent calculus in Gentzen (1934), on his preferred reading, allow for both an unspecific and a specific way of introducing assumptions.

In what follows, we take up Schroeder-Heister's call to contribute to the exploration of the notion of assumption and of its significance for philosophical logic (cf. Schroeder-Heister, 2004, p. 45). However, we shall remain within the confines of natural deduction and suggest a way to widen its conception of assumption so as to put natural deduction in a position to "capture and codify reasoning" (Schroeder-Heister, 2004, p. 28) with *would-counterfactuals*, i.e., constructions of the form

$$\text{(1)}\qquad\qquad\text{If }A\text{ were the case, }B\text{ would be the case.}$$

More precisely, the aim of this contribution is to outline an intuitionistically (or constructively; cf. Dalen, 2002) acceptable formal approach to counterfactual reasoning and to the semantics of *would*-counterfactuals in terms of *modal proof systems* which are motivated directly by the practice of counterfactual inference making. Specifically, the systems use different *modes of making assumptions*.

Modes of assumptions, as we shall understand them in what follows, are dependent on the *factuality status* of the formula that is to be assumed. In a modal proof system this status is determined by means of a *reference proof system* on top of which the modal system is defined. In a nutshell, we assign factual status to a formula A, in case A has been derived *canonically* (i.e., by means of an application of an introduction rule in the last inference step; cf. Dummett, 1991; Prawitz 2006; 2012) in the reference proof system S. We shall distinguish three modes of making assumptions in modal proof systems. Intuitively, in order to assume A in the *factual* mode, we need to make sure that a canonical derivation of A in S has been constructed. To assume A in the *counterfactual* mode, we need to make sure that a canonical derivation of A in S has not been constructed. Finally, to assume A in the *independent* mode, we just assume A (without any proviso).
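The three modes can be glossed computationally (a sketch under our own naming, not Więckowski's formalism): given the set of formulas that possess canonical derivations in the reference system S, an assumption of A in a requested mode is either admissible or not.

```python
# Sketch of the three assumption modes (illustrative names only):
# a formula counts as factual iff it has a canonical derivation in the
# reference system S, here abstracted as a set of canonically derivable formulas.

def assumption_mode(formula, canonical, mode_requested="independent"):
    """Check that the requested mode of assuming `formula` is admissible."""
    if mode_requested == "factual":
        return formula in canonical      # needs a canonical S-derivation
    if mode_requested == "counterfactual":
        return formula not in canonical  # needs the absence of one
    return True                          # independent mode: no proviso

canonical = {"Fa", "Rab"}  # hypothetical canonically derivable atoms
print(assumption_mode("Fa", canonical, "factual"))         # True
print(assumption_mode("Fb", canonical, "counterfactual"))  # True
print(assumption_mode("Fa", canonical, "counterfactual"))  # False
```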

Whereas sentences of the form (1) are most adequately used in case A does not count as a fact, *causal since-subordinator sentences* of the form

$$\text{(2)}\qquad\qquad\text{Since }A\text{ is the case, it is the case that }B.$$

are most adequately used in case A does count as a fact (e.g., Dancygier and Sweetser, 2000, p. 126). Due to their sensitivity to the factuality status of the formula that is to be assumed, our modal natural deduction systems will also be in a position

to deal with such *since*-constructions. Our analysis will focus exclusively on such intuitive uses. More specifically, constructions of the first form will be analysed as counterfactual implications, and those of the second form as factual implications. The meaning of the former will be explained by appeal to counterfactual assumptions, that of the latter by appeal to factual assumptions. In order to outline the main idea more clearly, the analysis will be confined to very elementary instances of (1) and (2).

The idea of using different ways of making assumptions for the purpose of a proof-theoretic analysis of counterfactual reasoning goes back at least to Thomason's (1970) Fitch-style natural deduction system FCS for Stalnaker's (1970) preferred conditional logic CS. However, not only the details but also the motivation of our modal natural deduction systems differ from those underlying Thomason's FCS. We do not aim to formulate a structural proof system which is equivalent with a specific Hilbert-style axiom system for some specific conditional logic whose semantics is given non-inferentially. Rather, we shall develop our systems without any Hilbert-system in mind. One reason for this is that the axioms of such systems ultimately depend for their intelligibility on specific model-theoretic conditions. As a result, axioms may inherit undesirable features from these conditions. Recall, for example, that D. Lewis felt he should apologize for the "long and obscure" axiom $(A \mathrel{\Box\!\!\to} {\sim} A) \vee (((A \mathbin{\&} B) \mathrel{\Box\!\!\to} C) \equiv (A \mathrel{\Box\!\!\to} (B \supset C)))$ of his simplest Hilbert-style system for VC (cf. Lewis, 2011, p. 133). It is for this reason that he preferred an equivalent axiomatization of VC in terms of his notion of "comparative possibility" of possible worlds (cf. Lewis, 2011, §2.5) rather than in terms of his *would*-counterfactual $\Box\!\!\to$. Furthermore, Hilbert-systems are not indispensable, if our aim is to formulate inferentially intuitive natural deduction systems which have good structural properties (e.g., normalization, subformula property) and which admit a *proof-theoretic semantics* (see Francez, 2015; Kahle and Schroeder-Heister, 2006; Piecha and Schroeder-Heister, 2016; Schroeder-Heister, 2018; Wansing, 2001) that is acceptable from an intuitionistic point of view (see Dummett, 1991; Prawitz 2006; 2012; for model-theoretic semantical considerations on Hilbert-systems for intuitionistic conditional logic see, e.g., Ciardelli and Liu, 2020 and Weiss, 2018).

Specifically, the intended proof-theoretic semantics is to be *semantically autarkic*, i.e., not defined in terms of a structural proof system that is itself defined by appeal to a formal semantics of a different kind (cf. Więckowski, 2021a). Since our approach to counterfactual reasoning is intended to be acceptable from an *intuitionistic* perspective, it will support a verification oriented conception of truth (cf. Dummett, 1991; Prawitz 2006; 2012).

The way in which this contribution is organized reflects the architecture of a modal natural deduction system. Section 2 defines the kind of proof system that will be used as reference proof system of such a system. Modal natural deduction systems for reasoning with factual and counterfactual implications will then be defined in Section 3. This section also contains the main results of this contribution (i.e., normalization and the subexpression/subformula property for the intended modal systems) and presents a proof-theoretic semantics with the desired properties. In Section 4, the modal proof systems will be used in an analysis of so-called counterpossibles (see, e.g., Berto, French, Priest, and Ripley, 2018 and Williamson, 2007) and related constructions. Section 5 makes some concluding remarks.

#### **2 Reference proof systems**

We shall now define the kind of reference proof system on top of which modal natural deduction systems will be defined in Section 3. We first specify the language and then formulate, in three steps, the intended kind of reference proof system. We choose subatomic natural deduction systems (cf. Więckowski 2011; 2016; 2021b; 2021a; 2023) as our reference proof systems, since we need, as mentioned above, a way to obtain canonical derivations of atomic sentences, if A in (1) (resp. (2)) is atomic and if we want to assume A in the counterfactual (factual) mode. Unlike standard natural deduction systems, subatomic natural deduction systems maintain introduction and elimination rules also for atomic sentences.

#### **2.1 Subatomic systems**

We first define the language $L\_0$ that we shall use in the formulation of reference proof systems.

**Definition 2.1** $L\_0$ is a first-order language which is defined in the usual inductive way. C and P are the sets of individual (or nominal) constants (metavariables: o, α) and n-ary predicate constants (metavariables: φⁿ), respectively. $L\_0$-formulae are atomic formulae (form: $\varphi^n\alpha\_1 \ldots \alpha\_n$), absurdity (⊥), conjunctions (A & B), disjunctions (A ∨ B), implications (A ⊃ B), universal quantifications (∀xA), and existential quantifications (∃xA). In addition to defined operators for negation and bi-implication, $L\_0$ contains also a special non-primitive identity predicate:


$$\begin{aligned} K^{n}\_{\varphi^{n}}(o\_{1}, o\_{2}) =\_{def} \forall z\_{1} \ldots \forall z\_{n} (&(\varphi^{n}o\_{1}z\_{2} \ldots z\_{n} \leftrightarrow \varphi^{n}o\_{2}z\_{2} \ldots z\_{n}) \\ \&\ &(\varphi^{n}z\_{1}o\_{1} \ldots z\_{n} \leftrightarrow \varphi^{n}z\_{1}o\_{2} \ldots z\_{n}) \\ \&\ldots\&\ &(\varphi^{n}z\_{1} \ldots z\_{n-1}o\_{1} \leftrightarrow \varphi^{n}z\_{1} \ldots z\_{n-1}o\_{2})). \end{aligned}$$

Let $\varphi\_1^{k\_1}, \ldots, \varphi\_m^{k\_m}$ be all the predicate constants in P, where $\varphi\_i^{k\_i}$ is $k\_i$-ary.

$$
o\_1 \ddot{=} o\_2 =\_{def} K^{k\_1}\_{\varphi\_1}(o\_1, o\_2) \mathbin{\&} \ldots \mathbin{\&} K^{k\_m}\_{\varphi\_m}(o\_1, o\_2) \,.
$$

Atm is the set of atomic sentences. *Atm*(o) $=\_{def}$ {A ∈ Atm : A contains at least one occurrence of o ∈ C} and *Atm*(φ) $=\_{def}$ {A ∈ Atm : A contains an occurrence of φ ∈ P}. Due to the presence of $\ddot{=}$ in $L\_0$ we take P to be finite.

The first step of the definition of the intended kind of reference proof system consists in the definition of a subatomic system. In such systems we may introduce and eliminate atomic sentences using term assumptions for non-logical constants.

**Definition 2.2** A *subatomic system* S is a pair ⟨I, R⟩, where I is a *subatomic base* and R is a set of *introduction and elimination rules for atomic sentences*. I is a 3-tuple ⟨C, P, f⟩, where C and P are as above, and where f is such that:

1. For any o ∈ C, f : C → ℘(Atm), where f(o) ⊆ *Atm*(o).

2. For any φ ∈ P, f : P → ℘(Atm), where f(φ) ⊆ *Atm*(φ).

We let τΓ $=\_{def}$ f(τ) for any τ ∈ C ∪ P, and call τΓ the set of *term assumptions* for τ. R contains I/E-rules of the following form:

$$\frac{\begin{array}{ccc}\mathcal{D}\_{0} & \mathcal{D}\_{1} \;\cdots & \mathcal{D}\_{n} \\ \varphi\_{0}^{n}\Gamma & \alpha\_{1}\Gamma \;\cdots & \alpha\_{n}\Gamma\end{array}}{\varphi\_{0}^{n}\alpha\_{1}\dots\alpha\_{n}}\ (a s\mathrm{I}) \qquad \frac{\varphi\_{0}^{n}\alpha\_{1}\dots\alpha\_{n}}{\tau\_{i}\Gamma}\ (a s\mathrm{E}\_{i})$$

with the side conditions $\varphi\_{0}^{n}\alpha\_{1}\dots\alpha\_{n} \in \varphi\_{0}^{n}\Gamma \cap \alpha\_{1}\Gamma \cap \dots \cap \alpha\_{n}\Gamma$ on (asI) and $\tau\_{i} \in \{\varphi\_{0}^{n}, \alpha\_{1}, \dots, \alpha\_{n}\}$ on (asE$\_i$).

Intuitively, a term assumption stores the elementary information which is associated with a non-logical constant and the I-rule allows us to establish the truth of an atomic sentence on the basis of this information.
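The side condition of (asI) is a simple membership test: the atomic sentence must lie in the intersection of the term assumptions of all constants occurring in it. A sketch with a hypothetical subatomic base (our own illustration, not the paper's notation):

```python
# Sketch of the (asI) side condition: an atomic sentence, given together
# with the list of its non-logical constants [predicate, arg1, ..., argn],
# is introducible iff it belongs to every constant's set of term assumptions.
from functools import reduce

def as_i_applicable(atom, constants, term_assumptions):
    """atom: string; constants: its non-logical constants; term_assumptions: dict."""
    sets = [term_assumptions[c] for c in constants]
    return atom in reduce(set.intersection, sets)

# Hypothetical subatomic base: F unary, R binary, constants a and b.
base = {
    "F": {"Fa", "Fb"},
    "R": {"Rab", "Rba"},
    "a": {"Fa", "Rab", "Rba"},
    "b": {"Fb", "Rab", "Rba"},
}
print(as_i_applicable("Rba", ["R", "b", "a"], base))  # True
print(as_i_applicable("Fb", ["F", "a"], base))        # False: Fb not in f(a)
```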

#### **Definition 2.3** *Derivations in* S*-systems*.

*Basic step*. Any term assumption τΓ and any atomic sentence A (i.e., a derivation of A from the open assumption A) is an S-derivation.

*Induction step*. If $\mathcal{D}\_i$, for i ∈ {0, . . . , n}, are S-derivations, then an S-derivation can be constructed by means of the I/E-rules displayed above.

*Example 2.4* Let the S-system contain only two predicates (i.e., F, R) and two nominal constants (i.e., a, b), and let the term assumptions be as follows: FΓ = {Fa, Fb}, RΓ = {Rab, Rba}, aΓ = {Fa, Rab, Rba}, and bΓ = {Fb, Rab, Rba}.

$$(3) \qquad \dfrac{R\Gamma \qquad \dfrac{\dfrac{R\Gamma\quad b\Gamma\quad a\Gamma}{Rba}\ (a s\mathrm{I})}{b\Gamma}\ (a s\mathrm{E}\_1) \qquad \dfrac{\dfrac{F\Gamma\quad a\Gamma}{Fa}\ (a s\mathrm{I})}{a\Gamma}\ (a s\mathrm{E}\_1)}{Rba}\ (a s\mathrm{I})$$

Derivation (3) contains two detours and is, therefore, not in normal form (or normal).

**Definition 2.5** *Detour conversion*:

$$\frac{\dfrac{\begin{array}{ccc}\mathcal{D}\_0 & \mathcal{D}\_1 \;\cdots & \mathcal{D}\_n \\ \varphi\_0^n \Gamma & \alpha\_1 \Gamma \;\cdots & \alpha\_n \Gamma\end{array}}{\varphi\_0^n \alpha\_1 \dots \alpha\_n}\ (a s\mathrm{I})}{\tau\_i\Gamma}\ (a s\mathrm{E}\_i) \quad \rightsquigarrow\quad \begin{array}{c}\mathcal{D}\_i \\ \tau\_i \Gamma\end{array}$$

**Theorem 2.6** *Any derivation* D *in an* S*-system can be transformed into a normal* S*-derivation.*

*Proof* Immediate. □

**Definition 2.7** Let D be a derivation in an S-system.


**Theorem 2.8** *If* D *is a normal* S*-derivation of an* S*-unit* u *from a set of* S*-units* Γ*, then each* S*-unit in* D *is a subexpression of an expression in* Γ ∪ {u}*.*

*Proof* Immediate. □

#### **2.2 Subatomic identity systems**

The next step of the definition of the intended kind of reference proof system consists in the extension of subatomic systems to subatomic identity systems by adding I/E-rules for non-primitive identity sentences. Roughly, two nominal constants are identical in this sense if they are indistinguishable with respect to the elementary information associated with them (cf. Definition 2.2).

**Definition 2.9** Atomic sentences $A(o\_1)$ and $A(o\_2)$ are *mirror atomic sentences* if and only if they are exactly alike except that the former contains occurrences of $o\_1$ at all the places at which the latter contains occurrences of $o\_2$, and vice versa.

**Definition 2.10** A *subatomic identity system* S=̈ is a 3-tuple ⟨I, R, R=̈⟩ which extends a subatomic system with a set R=̈ of *I/E-rules for* =̈*-sentences*:

$$\frac{\begin{matrix}[\varphi_1(\alpha_1)]^{(1_1)} & [\varphi_1(\alpha_2)]^{(1_2)} & & [\varphi_k(\alpha_1)]^{(k_1)} & [\varphi_k(\alpha_2)]^{(k_2)}\\ \mathcal{D}_{1_1} & \mathcal{D}_{1_2} & \cdots & \mathcal{D}_{k_1} & \mathcal{D}_{k_2}\\ \varphi_1(\alpha_2) & \varphi_1(\alpha_1) & & \varphi_k(\alpha_2) & \varphi_k(\alpha_1)\end{matrix}}{\alpha_1\,\ddot{=}\,\alpha_2}\,(\ddot{=}\mathrm{I}),\,1_1,1_2,\ldots,k_1,k_2$$

$$\frac{\begin{matrix}\mathcal{D}_1\\ \alpha_1\,\ddot{=}\,\alpha_2\end{matrix}\qquad\begin{matrix}\mathcal{D}_2\\ \varphi_i(\alpha_1)\end{matrix}}{\varphi_i(\alpha_2)}\,(\ddot{=}\mathrm{E}_1) \qquad\qquad \frac{\begin{matrix}\mathcal{D}_1\\ \alpha_1\,\ddot{=}\,\alpha_2\end{matrix}\qquad\begin{matrix}\mathcal{D}_2\\ \varphi_i(\alpha_2)\end{matrix}}{\varphi_i(\alpha_1)}\,(\ddot{=}\mathrm{E}_2)$$

where i ∈ {1, . . . , k}, and φᵢ(α₁) and φᵢ(α₂) are mirror atomic sentences.

*Remark 2.11* In the =̈I/E-rules the operators figuring in the definiens of =̈ (Definition 2.1) have been absorbed, so to speak, into the metalanguage.

**Definition 2.12** *Derivations in* S=̈*-systems*.

*Basic step*. Any derivation in an S-system and any identity sentence α₁=̈α₂ (i.e., a derivation from the open assumption of α₁=̈α₂), where possibly α₁ = α₂, is an S=̈-derivation.

*Induction step*. If D₁₁, D₁₂, . . . , Dₖ₁, Dₖ₂, D₁, and D₂ are S=̈-derivations, then an S=̈-derivation can be constructed using the =̈I/E-rules listed above.

*Example 2.13* For simplicity, let the S=̈-system contain only one predicate (i.e., F) and two nominal constants (i.e., a and b). And let the term assumptions be as follows: Γ_F = {a, b}, Γ_a = {F}, and Γ_b = {F}.

$$(4) \qquad \frac{\dfrac{\dfrac{[Fa]^{(1_1)}}{F\Gamma}\,(as\mathrm{E}_0)\quad b\Gamma}{Fb}\,(as\mathrm{I})\qquad \dfrac{\dfrac{[Fb]^{(1_2)}}{F\Gamma}\,(as\mathrm{E}_0)\quad a\Gamma}{Fa}\,(as\mathrm{I})}{a\,\ddot{=}\,b}\,(\ddot{=}\mathrm{I}),\,1_1,1_2$$

$$(5) \qquad \frac{\dfrac{\dfrac{[Fa]^{(1_1)}}{F\Gamma}\,(as\mathrm{E}_0)\quad \dfrac{[Fa]^{(1_1)}}{a\Gamma}\,(as\mathrm{E}_1)}{Fa}\,(as\mathrm{I})\qquad \dfrac{\dfrac{[Fa]^{(1_2)}}{F\Gamma}\,(as\mathrm{E}_0)\quad \dfrac{[Fa]^{(1_2)}}{a\Gamma}\,(as\mathrm{E}_1)}{Fa}\,(as\mathrm{I})}{a\,\ddot{=}\,a}\,(\ddot{=}\mathrm{I}),\,1_1,1_2$$

*Example 2.14* Let φ₁(α), . . . , φₖ(α) ∈ *Atm*(α) for any α ∈ C. The following is a derivation in any S=̈-system: (6)

$$\frac{[\varphi_1(\alpha)]^{(1_1)}\ \ [\varphi_1(\alpha)]^{(1_2)}\ \ \ldots\ \ [\varphi_k(\alpha)]^{(k_1)}\ \ [\varphi_k(\alpha)]^{(k_2)}}{\alpha\,\ddot{=}\,\alpha}\,(\ddot{=}\mathrm{I}),\,1_1,1_2,\ldots,k_1,k_2$$

*Remark 2.15* According to Example 2.14, α=̈α does not need to be postulated as an axiom. In particular, it is not declared, as is usually the case, a conclusion of a zero-premiss I-rule. Rather, it is inferred, on a non-empty basis, by appeal to mirror formulae.

**Definition 2.16** *Detour conversions for* =̈:

$$\frac{\dfrac{\begin{matrix}[\varphi_1(\alpha_1)]^{(1_1)} & [\varphi_1(\alpha_2)]^{(1_2)} & & [\varphi_k(\alpha_1)]^{(k_1)} & [\varphi_k(\alpha_2)]^{(k_2)}\\ \mathcal{D}_{1_1} & \mathcal{D}_{1_2} & \cdots & \mathcal{D}_{k_1} & \mathcal{D}_{k_2}\\ \varphi_1(\alpha_2) & \varphi_1(\alpha_1) & & \varphi_k(\alpha_2) & \varphi_k(\alpha_1)\end{matrix}}{\alpha_1\,\ddot{=}\,\alpha_2}\,(\ddot{=}\mathrm{I})\qquad \begin{matrix}\mathcal{D}_2\\ \varphi_i(\alpha_1)\end{matrix}}{\varphi_i(\alpha_2)}\,(\ddot{=}\mathrm{E}_1) \quad\rightsquigarrow_{\mathrm{conv}}\quad \begin{matrix}\mathcal{D}_2\\ [\varphi_i(\alpha_1)]\\ \mathcal{D}_{i_1}\\ \varphi_i(\alpha_2)\end{matrix}$$

$$\frac{\dfrac{\begin{matrix}[\varphi_1(\alpha_1)]^{(1_1)} & [\varphi_1(\alpha_2)]^{(1_2)} & & [\varphi_k(\alpha_1)]^{(k_1)} & [\varphi_k(\alpha_2)]^{(k_2)}\\ \mathcal{D}_{1_1} & \mathcal{D}_{1_2} & \cdots & \mathcal{D}_{k_1} & \mathcal{D}_{k_2}\\ \varphi_1(\alpha_2) & \varphi_1(\alpha_1) & & \varphi_k(\alpha_2) & \varphi_k(\alpha_1)\end{matrix}}{\alpha_1\,\ddot{=}\,\alpha_2}\,(\ddot{=}\mathrm{I})\qquad \begin{matrix}\mathcal{D}_2\\ \varphi_i(\alpha_2)\end{matrix}}{\varphi_i(\alpha_1)}\,(\ddot{=}\mathrm{E}_2) \quad\rightsquigarrow_{\mathrm{conv}}\quad \begin{matrix}\mathcal{D}_2\\ [\varphi_i(\alpha_2)]\\ \mathcal{D}_{i_2}\\ \varphi_i(\alpha_1)\end{matrix}$$

**Theorem 2.17** *Any derivation* D *in an* S=̈*-system can be transformed into a normal* S=̈*-derivation.*

*Proof* Cf. Więckowski (2016). □

**Definition 2.18** Let D be a derivation in an S=̈-system.


**Theorem 2.19** *If* D *is a normal* S=̈*-derivation of an* S=̈*-unit* u *from a set of* S=̈*-units* Γ*, then each* S=̈*-unit in* D *is a subexpression of an expression in* Γ ∪ {u}*.*

*Proof* Cf. Więckowski (2016). □

#### **2.3 Subatomic natural deduction systems**

We now complete the definition of the intended kind of reference proof system. In order to reduce complexity and to focus on the main idea underlying modal proof systems, we define these reference systems, **I**(S=̈)-systems, only for a fragment of L₀.

**Definition 2.20** *The language* L₀′. L₀′ is the fragment of L₀ which comprises only ⊥, atomic, =̈-, and ⊃-formulae.

#### **Definition 2.21** *Derivations in I*(S=̈)*-systems*.

*Basic step*. Any derivation in an S=̈-system and any L₀′-formula A (i.e., a derivation from the open assumption of A) is an **I**(S=̈)-derivation.

*Induction step*. If D₁ and D₂ are **I**(S=̈)-derivations, then an **I**(S=̈)-derivation can be constructed by means of the following rules:

$$\begin{array}{ccc} [A]^{(u)} & & \\ \mathcal{D}_1 & \mathcal{D}_1 \quad\ \mathcal{D}_2 & \mathcal{D}_1 \\ \dfrac{B}{A \supset B}\,(\supset\mathrm{I}),u & \dfrac{A \supset B \quad A}{B}\,(\supset\mathrm{E}) & \dfrac{\bot}{A}\,(\bot\mathrm{i}) \end{array}$$

	- 2. A canonical derivation D of A in an **I**(S=̈)-system is a *canonical proof* of A in that system if there are no applications of the ⊥i-rule and no undischarged assumptions in D.
	- 3. The conclusions of canonical **I**(S=̈)-derivations are **I**(S=̈)-*theses* and the conclusions of **I**(S=̈)-proofs are also **I**(S=̈)-*theorems*.

*Example 2.23* Let the **I**(S=̈)-system contain only one predicate (i.e., F) and two nominal constants (i.e., a, b). Let the term assumptions be like in Example 2.13: Γ_F = {a, b}, Γ_a = {F}, and Γ_b = {F}.


$$(7) \qquad \frac{\dfrac{\dfrac{\dfrac{[Fa]^{(2_1)}}{F\Gamma}\,(as\mathrm{E}_0)\quad b\Gamma}{Fb}\,(as\mathrm{I})\qquad \dfrac{\dfrac{[Fb]^{(2_2)}}{F\Gamma}\,(as\mathrm{E}_0)\quad a\Gamma}{Fa}\,(as\mathrm{I})}{a\,\ddot{=}\,b}\,(\ddot{=}\mathrm{I}),\,2_1,2_2}{a\,\ddot{=}\,b \supset a\,\ddot{=}\,b}\,(\supset\mathrm{I}),1$$

$$(8) \qquad \frac{[a\,\ddot{=}\,b]^{(1)}}{a\,\ddot{=}\,b \supset a\,\ddot{=}\,b}\,(\supset\mathrm{I}),1$$

It can be readily verified, largely relying on standard methods (cf. Prawitz, 1965; Troelstra and Schwichtenberg, 2000), that derivations in **I**(S=̈)-systems can be transformed into normal derivations and that normal derivations possess the subexpression/subformula property.

**Definition 2.24** *Detour conversions in I*(S=̈)*-systems*. The detour conversions for atomic sentences and =̈ are like those in Definitions 2.5 and 2.16. These are supplemented with the detour conversion for ⊃:

$$\frac{\dfrac{\begin{matrix}[A]^{(u)}\\ \mathcal{D}_1\\ B\end{matrix}}{A \supset B}\,(\supset\mathrm{I}),u \qquad \begin{matrix}\mathcal{D}_2\\ A\end{matrix}}{B}\,(\supset\mathrm{E}) \quad\rightsquigarrow_{\mathrm{conv}}\quad \begin{matrix}\mathcal{D}_2\\ [A]\\ \mathcal{D}_1\\ B\end{matrix}$$

**Theorem 2.25** *Normalization for I*(S=̈)*-systems: Any derivation* D *in an I*(S=̈)*-system can be transformed into a normal I*(S=̈)*-derivation.*

*Proof* A consequence of the normalization proof in Więckowski (2016). □

*Remark 2.26* (3) is neither normal nor canonical, but it can be transformed into a normal **I**(S=̈)-derivation (that is not canonical). (4)-(5) and (7)-(8) are normal canonical **I**(S=̈)-derivations. (6) and (8) have the form of normal canonical **I**(S=̈)-proofs.

**Definition 2.27** Let D be a derivation in an **I**(S=̈)-system.


**Theorem 2.28** *Subexpression property for I*(S=̈)*-systems: If* D *is a normal I*(S=̈)*-derivation of a unit* u *from a set of units* Γ*, then each unit in* D *is a subexpression of an expression in* Γ ∪ {u}*.*

*Proof* A consequence of the corresponding proof in Więckowski (2016). □

**Corollary 2.29** *Subformula property for I*(S=̈)*-systems: If* D *is a normal I*(S=̈)*-derivation of a formula* A *from a set of formulae* Γ*, then each formula in* D *is a subformula of a formula in* Γ ∪ {A}*.*

*Remark 2.30* (4)-(8) possess the subexpression property, and so does (3) after its transformation into a normal derivation. Concerning the subformula property, we have to bear in mind that =̈-formulae are abbreviations according to Definition 2.1.

#### **3 Modal proof systems**

Modal proof systems, as we shall understand them, are proof systems which, given a reference proof system that serves to determine what counts as a fact, distinguish between various modes of making assumptions. Section 3.1 defines modal proof systems for reasoning with elementary *would*-counterfactuals and causal *since*-subordinator sentences. Section 3.2 formulates a proof-theoretic semantics for such constructions.

#### **3.1 IFC-systems**

The natural deduction systems to be defined in this section maintain three modes of assumption. Derivations in these systems reflect, as it were, how the modes of assumptions are related to the moods of implications. The systems are defined for the language L₁.

**Definition 3.1** *The language* L₁. The notion of a formula of L₁ is inductively defined by the following clauses:


Note that we do not use a different symbol for mode-sensitive implication. Let ◦ ∈ {⊃_f, ⊃_c, ⊃}. Call the ◦-operators *implication-operators* and formulae with principal ◦ *implication-formulae*. *Defined operators of* L₁: ¬_f A =*def* A ⊃_f ⊥ (factual negation), ¬_c A =*def* A ⊃_c ⊥ (counterfactual negation), ¬A =*def* A ⊃ ⊥ (mode-sensitive negation).

For instance, we symbolize sentences of the form (1) by A ⊃_c B and sentences of the form (2) by A ⊃_f B.

**Definition 3.2** Let *Fml*₀ be the set of formulae of L₀′ and let *Fml*₁ be the set of formulae of L₁. An L₁-formula A is a *modal formula* in case A ∈ *Fml*₁ \ *Fml*₀.

We now define the intended modal proof systems.

**Definition 3.3** An IFC-*system* is a modal natural deduction system for intuitionistic factual, counterfactual, and mode-sensitive implication which, given a reference

proof system (Definition 3.4), distinguishes three modes of making assumptions (Definition 3.5): factual, counterfactual, and independent. Derivations in IFC-systems are defined on the basis of these modes (Definition 3.7).

**Definition 3.4** *Reference proof system S*. Let S be an **I**(S=̈)-system. Let an *established thesis of* S be an L₀′-formula for which a canonical S-derivation (Definition 2.22(1)) has been constructed. And let Θ_S be the set of established theses (or facts).

**Definition 3.5** *Modes of assumptions (IFC-systems)*. There are three modes of making assumptions in IFC-systems:


We write |A| / ≀A≀ / A to indicate that A is assumed in the factual, the counterfactual, or the independent mode, respectively.

*Remark 3.6* A consequence of Definition 3.5 is that modal L₁-formulae can be assumed only in the independent mode.

#### **Definition 3.7** *Derivations in* IFC*-systems*.

*Basic step*. Any derivation in the reference system S of an IFC-system, any L₀′-formula A assumed in the factual (resp. counterfactual) mode |A| (≀A≀), i.e., a derivation from the open factual (counterfactual) assumption of A, and any L₁-formula A assumed in the independent mode, i.e., a derivation from the open independent assumption of A, is a derivation in that IFC-system.

*Induction step*. If D₁ and D₂ are IFC-derivations, then an IFC-derivation can be constructed by means of I/E-rules for atomic sentences and =̈, which now also take the modes of assumptions into account, and the following rules:

$$\begin{array}{cccc} [|A|]^{(u)} & & [{\wr}A{\wr}]^{(u)} & \\ \mathcal{D}_1 & \mathcal{D}_1 \quad\ \mathcal{D}_2 & \mathcal{D}_1 & \mathcal{D}_1 \quad\ \mathcal{D}_2 \\ \dfrac{B}{A \supset_f B}\,(\supset_f\mathrm{I}),u & \dfrac{A \supset_f B \quad A}{B}\,(\supset_f\mathrm{E}) & \dfrac{B}{A \supset_c B}\,(\supset_c\mathrm{I}),u & \dfrac{A \supset_c B \quad A}{B}\,(\supset_c\mathrm{E}) \end{array}$$

$$\begin{array}{ccc} [|A|/{\wr}A{\wr}/A]^{(u)} & & \\ \mathcal{D}_1 & \mathcal{D}_1 \quad\ \mathcal{D}_2 & \mathcal{D}_1 \\ \dfrac{B}{A \supset B}\,(\supset\mathrm{I}),u & \dfrac{A \supset B \quad A}{B}\,(\supset\mathrm{E}) & \dfrac{\bot}{A}\,(\bot\mathrm{i}) \end{array}$$

1. *Side conditions*:

SC1. ⊃_f I: No empty discharge; and no empty discharge contained in D₁.

SC2. ⊃_f E: The minor premiss A has *factual status*, i.e.: A depends on no counterfactual assumption in D₂; and either A depends on at least one factual assumption in D₂, or D₂ contains at least one term assumption, or D₂ is a derivation in S.

SC3. ⊃_c I: Like SC1.

	- AP1. No formula is assumed in more than one mode in D.
	- AP2. The mode in which an antecedent A is assumed in ◦I-applications in D determines the *modal status* (factual, counterfactual, independent) of all antecedent A-nodes (i.e., minor premisses of ◦E-applications) in D.

*Remark 3.8* 1. *Factual implication*: SC1 ensures that A is indeed assumed factually and that factual implication behaves inferentially in an intuitively required non-monotonic manner. SC2 ensures that the minor premiss is rooted in facts and does not rest on the unestablished.

2. *Counterfactual implication*: SC3 ensures that A is indeed assumed counterfactually and that counterfactual implication behaves non-monotonically. SC4a ensures that the minor premiss is neither based entirely on facts nor on assumptions made in the independent mode. SC4b excludes break formulae in order to block transitivity.

3. *(Mode-sensitive) implication*: There are no side conditions on the I/E-rules for ⊃. The ⊃-rules of S-systems are special cases of the ⊃-rules of IFC-systems. We may regard these special cases, allowing only for independent assumptions, as governing ⊃ in the usual (i.e., mode-less) sense of implication. As mentioned, if A is a modal L₁-formula (Definition 3.2), [|A|/≀A≀/A]^(u) in an ⊃I-application can only be of the form [A]^(u).

4. We obtain a minimal system (abbr. MFC-system) from an IFC-system, if we remove the ⊥i-rule from the latter.

**Definition 3.9** *Canonical derivation, canonical proof, thesis, and theorem (*IFC*-systems)*. Analogous to Definition 2.22.

*Example 3.10*

$$(9) \qquad \frac{\dfrac{\dfrac{\dfrac{\dfrac{\dfrac{[{\wr}Rab{\wr}]^{(1)}}{R\Gamma}\,(as\mathrm{E}_0)\quad \dfrac{[|Fc|]^{(2)}}{c\Gamma}\,(as\mathrm{E}_1)\quad d\Gamma}{Rcd}\,(as\mathrm{I})}{Fc \supset_f Rcd}\,(\supset_f\mathrm{I}),2}{Rab \supset_c (Fc \supset_f Rcd)}\,(\supset_c\mathrm{I}),1\quad\ {\wr}Rab{\wr}}{Fc \supset_f Rcd}\,(\supset_c\mathrm{E})\quad\ \dfrac{F\Gamma \quad c\Gamma}{Fc}\,(as\mathrm{I})}{Rcd}\,(\supset_f\mathrm{E})}{c\Gamma}\,(as\mathrm{E}_1)$$


$$(10) \qquad \text{(a)}\quad \frac{\dfrac{\dfrac{[\neg A]^{(2)}\quad [{\wr}A{\wr}]^{(1)}}{\bot}\,(\supset\mathrm{E})}{\neg\neg A}\,(\supset\mathrm{I}),2}{A \supset_c \neg\neg A}\,(\supset_c\mathrm{I}),1$$

(b) [display not recoverable from the source]

The side conditions imposed on the ⊃_c-rules guarantee that IFC-systems do justice to traditional *counterfactual fallacies* (cf. Stalnaker, 1968, pp. 48–49; see also Lewis, 2011, §1.8). The failures of transitivity and contraposition are reflected by the illegal (11a) and (11b), respectively. In both derivations B is a break formula. (12a) and (12b) contain violations of weakening and are related to the fallacy of strengthening of the antecedent.

$$(11) \qquad \text{(a)}\quad \frac{\dfrac{B \supset_c C \qquad \dfrac{A \supset_c B \quad [{\wr}A{\wr}]^{(1)}}{B}\,(\supset_c\mathrm{E})}{C}\,(\supset_c\mathrm{E})\ \textit{illeg.}}{A \supset_c C}\,(\supset_c\mathrm{I}),1$$

$$\text{(b)}\quad \frac{\dfrac{\dfrac{\dfrac{[{\wr}\neg B{\wr}]^{(2)}\quad \dfrac{[A \supset_c B]^{(1)}\ \ [{\wr}A{\wr}]^{(3)}}{B}\,(\supset_c\mathrm{E})\ \textit{illeg.}}{\bot}\,(\supset\mathrm{E})}{\neg A}\,(\supset\mathrm{I}),3}{\neg B \supset_c \neg A}\,(\supset_c\mathrm{I}),2}{(A \supset_c B) \supset (\neg B \supset_c \neg A)}\,(\supset\mathrm{I}),1$$

$$(12) \qquad \text{(a)}\quad \frac{\dfrac{[A]^{(1)}}{B \supset_c A}\,(\supset_c\mathrm{I})\ \textit{illeg.}}{A \supset (B \supset_c A)}\,(\supset\mathrm{I}),1$$

$$\text{(b)}\quad \frac{\dfrac{\dfrac{\dfrac{[\neg(A \supset_c B)]^{(1)}\quad \dfrac{[B]^{(2)}}{A \supset_c B}\,(\supset_c\mathrm{I})\ \textit{illeg.}}{\bot}\,(\supset\mathrm{E})}{\neg B}\,(\supset\mathrm{I}),2}{A \supset_c \neg B}\,(\supset_c\mathrm{I})\ \textit{illeg.}}{\neg(A \supset_c B) \supset (A \supset_c \neg B)}\,(\supset\mathrm{I}),1$$

Derivations in IFC-systems may contain detours (i.e., cut or maximum formulae). (9) is an example. Such derivations can be transformed into normal derivations.

**Definition 3.11** 1. The occurrence of a formula in a derivation D in an IFC-system is a *cut (or maximum) formula* if it is the conclusion of an application of an I-rule and the (major) premiss of an E-rule. A *maximal* cut formula in D is a cut formula of maximal rank.

2. The *cut rank* of D is the pair ⟨d, n⟩, where d is the maximal rank of a cut formula in D and n is the number of maximal cut formulae in D. A derivation is *normal (or in normal form)* if it contains no cut formulae.
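The notions of cut formula and cut rank can be illustrated with a small toy sketch (not the chapter's system: derivations are modelled as bare trees, all I-rules and E-rules are collapsed to generic tags `"I"` and `"E"`, the major premiss is taken to be the leftmost one, and the rank of a formula is approximated by counting its implication signs, written `>`):

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Node:
    """A derivation node: a formula plus the rule that concluded it."""
    formula: str
    rule: Optional[str] = None              # None marks an assumption leaf
    premises: List["Node"] = field(default_factory=list)

def rank(formula: str) -> int:
    # toy rank measure: number of implication signs in the formula
    return formula.count(">")

def cut_formulas(d: Node) -> List[str]:
    # a cut (maximum) formula is the conclusion of an I-rule application
    # that is at the same time the major (leftmost) premiss of an E-rule
    cuts = []
    stack = [d]
    while stack:
        n = stack.pop()
        if n.rule == "E" and n.premises and n.premises[0].rule == "I":
            cuts.append(n.premises[0].formula)
        stack.extend(n.premises)
    return cuts

def cut_rank(d: Node) -> Tuple[int, int]:
    # the pair <d, n>: maximal rank of a cut formula, and the number
    # of cut formulae of that maximal rank
    cuts = cut_formulas(d)
    if not cuts:
        return (0, 0)
    m = max(rank(c) for c in cuts)
    return (m, sum(1 for c in cuts if rank(c) == m))
```

On a detour such as `A>A` introduced by `"I"` and immediately used as major premiss of `"E"`, `cut_rank` returns ⟨1, 1⟩, while a detour-free (normal) derivation gets cut rank ⟨0, 0⟩.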

**Definition 3.12** *Detour conversions in* IFC*-systems*. The detour conversions for atomic sentences and =̈ are like those in Definitions 2.5 and 2.16, except that assumptions of atomic and =̈-formulae may now occur in any of the three modes |A|/≀A≀/A. These are supplemented with detour conversions for the ◦-operators:


*Remark 3.13* 1. The ⊃-conversion of S-systems is a special case of the ⊃-conversion of IFC-systems (Definition 3.12).

2. Recall that, in virtue of AP1, a formula can be assumed in exactly one mode in a derivation. Because of [|A|]^(u), the minor premiss of the ⊃_f E-application in the ⊃_f-conversion has, by AP2, factual status. Similarly, since we have [≀A≀]^(u) on the left-hand side of the ⊃_c-conversion, the minor premiss of the ⊃_c E-application in the ⊃_c-conversion has counterfactual status. Finally, in the ⊃-conversion, A is assumed in exactly one of the three modes in [|A|/≀A≀/A]^(u). By AP2, this mode determines the modal status of the two A-nodes in the derivation on the left-hand side of the conversion.

The following considerations supplement Remark 3.13(2).

*Remark 3.14* Let C be an implication-formula A ◦ B. Let D be a derivation which derives C by means of an ◦I-application in its last step, and let D₁ be the subderivation of D which derives the premiss B of that ◦I-application. Let D₂ be a derivation which derives A. The tables below list the cases in which an ◦E-application can be used to construct a derivation D* of B (i.e., a detour derivation with C being a detour formula) from derivations D and D₂. (Since no mode-related side conditions are imposed on the I/E-rules for atomic and =̈-formulae, we consider only cases in which the premisses have been obtained by means of ◦-rules.) The columns with the heading D₁ (resp. D₂) indicate the last rule applied in D₁ (D₂). '+' means that the construction of D* is legal and that D*'s conversion is successful. '−' means that a conversion is precluded, since the construction is not legal. In case D₁ ends with an ◦E-application there are two entries. The first [second] entry indicates the result for the case in which C is introduced discharging an assumption used to derive the major [minor] premiss of that ◦E-application.



**Theorem 3.15** *Normalization for* IFC*-systems: Any derivation* D *in an* IFC*-system can be transformed into a normal* IFC*-derivation.*

*Proof* We proceed in the familiar way by applying detour conversions (Definition 3.12) to D, in order to arrive at cut rank ⟨0, 0⟩. □

(9), for instance, can be transformed into the term assumption cΓ, a derivation in normal form. (10a) and (10b) are normal derivations.

Normal derivations have a simple structure. It can be shown, adapting standard methods (cf. Prawitz, 1965; Troelstra and Schwichtenberg, 2000), that they possess the subexpression property (of which the subformula property is a special case).

**Definition 3.16** Let D be a derivation in an IFC-system.


**Definition 3.17** Let D be a normal derivation in an IFC-system, and let ◦ ∈ {⊃_f, ⊃_c, ⊃}. A sequence of unit occurrences u₀, . . . , uₙ such that


is a *track* of D. A track of *order* 0 in D is a track ending in the conclusion of D. A track of *order* n + 1 in D is a track ending in the minor premiss of an application of =̈E or ◦E with the major premiss belonging to a track of order n.
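As a schematic illustration (pure ⊃ fragment, modes ignored, standard Prawitz-style tracks assumed), consider the normal derivation

```latex
\[
  \frac{\dfrac{A \supset (A \supset B) \quad\ A}{A \supset B}\,(\supset\mathrm{E}) \quad\ A}
       {B}\,(\supset\mathrm{E})
\]
```

Here the unit occurrences A ⊃ (A ⊃ B), A ⊃ B, B form the track of order 0, and each minor premiss A, taken on its own, is a track of order 1, since it ends in the minor premiss of an ⊃E-application whose major premiss lies on the track of order 0.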

**Theorem 3.18** *Let* D *be a normal derivation in an* IFC*-system and let* π *be a track* u₀, . . . , uₙ *in* D*. Then there is a unit* uₘ *in* π*, the minimum part of the track, which divides* π *into two (possibly empty) parts, an E-part* u₀, . . . , uₘ₋₁ *and an I-part* uₘ₊₁, . . . , uₙ*. The E-part is constructed exclusively by E-rule applications. The I-part is constructed exclusively by I-rule applications.* uₘ *is the conclusion of an E-rule, and in case* m < n*, a premiss of an I-rule or of* ⊥*i.*

*Proof* By Theorem 3.15, a major premiss of an E-rule application cannot be a conclusion of an I-rule application. The result is a consequence of this insight. □

**Theorem 3.19** *Subexpression property for* IFC*-systems: If* D *is a normal* IFC*-derivation of a unit* u *from a set of units* Γ*, then each unit in* D *is a subexpression of an expression in* Γ ∪ {u}*.*

*Proof* Making use of Theorem 3.18, the result is established by induction on the order of tracks. □

**Corollary 3.20** *Subformula property for* IFC*-systems: If* D *is a normal* IFC*-derivation of a formula* A *from a set of formulae* Γ*, then each formula in* D *is a subformula of a formula in* Γ ∪ {A}*.*

As a consequence of Theorem 3.19, full analyticity can be claimed for the systems. We may use, relying on Corollary 3.20, the following method (cf. Więckowski, 2021a) in order to show that a formula of the language of IFC-systems cannot be derived as a theorem in these systems:

**Definition 3.21** *Method of counter-derivations*. Construct a candidate for a normal canonical IFC-proof of a formula A by proceeding bottom-up, using the rules for the operators while ignoring the side conditions on them. In case (i) the construction has been successful, check whether the candidate violates a side condition. If it does, (ia) we obtain a counter-derivation for A; otherwise (ib) we obtain a normal IFC-proof of A. In case (ii) the construction of a candidate has not been successful, we may conclude that A cannot be derived as a theorem. Consequently, we get a decision concerning the IFC-derivability of A as a theorem: it is derivable as a theorem in case (ib), and underivable in cases (ia) and (ii).

*Remark 3.22* Some of the derivations in (11) and (12) can be seen as counterderivations which show that their conclusions are not theorems.

*Remark 3.23* 1. As mentioned in Section 1, the idea of using different ways of making assumptions in the context of natural deduction for counterfactuals can be traced back at least to Thomason's (1970) FCS. Crucially, Thomason introduces the notion of a *strict derivation*. He takes it that in an ordinary derivation from an assumption A, we suppose that A is the case in the actual situation. By contrast, in a strict derivation, we may "hold in abeyance certain portions of our knowledge about our actual situation, and envisage another situation in which something is supposed to hold" (Thomason, 1970, p. 398). Since it may happen that in the envisaged alternative situations not all our knowledge about the actual situation is available, Thomason imposes restrictions on the availability of that knowledge in strict derivations. Formally, this is achieved by introducing special *reiteration rules* which govern reiteration into strict derivations. One may infer a *would*-counterfactual (Thomason's notation: A > B) by means of an introduction rule on the basis of a strict derivation of B from the, as it were, "strict" assumption of A. In particular, one may assume also known propositions in this counterfactual way. Thomason establishes the equivalence of FCS and CS. However, he does not discuss the proof-theoretic properties of FCS.

2. It seems possible to use factual implication for the formal analysis of those constructions of the form (2) in which "since" can be equivalently replaced by "because". For discussion of the relation between these two kinds of causal subordinator see, e.g., Dancygier and Sweetser (2000) and Guillaume (2013). For an outline of a formal system for reasoning with "because" see Schnieder (2011).

*Remark 3.24* In defining modal natural deduction systems there are several decisions to be made. These may concern, for instance, the choice of the reference proof system, the conception of an established fact, the modes in which modal formulae can legally be assumed, the shape of the rules, or the side conditions that are to be imposed on them. Thus, the complexity that pertains to formal accounts of counterfactual reasoning is not moved to conditions on external (e.g., model-theoretic) structures, but rather enters the systems via such proof-theoretic design options.

#### **3.2 A proof-theoretic semantics**

On the basis of Theorem 3.15, we may formulate a proof-theoretic semantics for the non-logical constants, the atomic sentences, the identity sentences, and for the formulae composed of the operators of IFC-systems.

**Definition 3.25** *Meaning*: Let the modal proof system be an IFC-system.


*Remark 3.26* 1. Definition 3.25 contains a proof-theoretic semantics for the non-logical constants and formulae of L₀′, defined in terms of S-derivations, as a special case.

2. Since meaning is defined in terms of canonical derivations (cf. Dummett, 1991; Prawitz, 2006), the semantics specified above is acceptable from an *intuitionistic* point of view.

3. The proposed proof-theoretic semantics is *semantically autarkic*, since the modal natural deduction systems do not draw on a formal semantics of a different kind (e.g., a possible worlds similarity semantics; cf. Lewis, 2011; Nute and Cross, 2001; Stalnaker, 1968; Stalnaker and Thomason, 1970). For instance, labelled (e.g., Negri and Olivetti, 2015; Negri and Sbardolini, 2016; Poggiolesi, 2016) or internal (e.g., Lellmann and Pattinson, 2012; Olivetti and Pozzato, 2015) structural proof systems for standard counterfactual logics, all of which are formulated in a classical context, do not allow for an autarkic proof-theoretic semantics. Labelled proof systems incorporate into their rules, by means of labels (for worlds) and labelled formulae (for similarity), the model-theoretic structures in terms of which truth conditions are formulated. A proof-theoretic semantics based on such a calculus (envisaged in Girlando, Negri, and Olivetti, 2018) would certainly not be autarkic. By contrast, internal proof systems for a given counterfactual logic can be characterized as not involving a syntax that cannot be defined in terms of the object language of that logic. As a result, the sequents of such a calculus do not wear their genesis on their sleeves. However, the internal systems mentioned above make use of structural operators and specific rules which directly imitate model-theoretic structures involved in the semantics. (Translations of internal into labelled systems and back are considered in Girlando, 2019; Girlando, Negri, and Olivetti, 2018.) From a foundational point of view—or seeing proof-theoretic semantics as an "alternative to truth-condition semantics" (Schroeder-Heister, 2018, p. 1)—neither an internalization of model-theoretic truth conditions nor an imitation of model-theoretic structures seems appealing.

**Definition 3.27** A [subatomic] proof system is *meaning-integral* if a proof-theoretic semantics is available for it that is based on [term assumptions and] canonical derivations.

Given a meaning-integral proof system, we may define a notion of *derivation-based intuitionistic truth* (cf. Więckowski, 2023).

**Definition 3.28** Given an IFC-system, call {A : Γ ⊢ A}, i.e., the set of formulae which have been canonically derived in the system from a set of units Γ, its *canonical set*. The *truth* of A *with respect to* Γ is defined by: Γ ⊩ A =*def* A ∈ {A : Γ ⊢ A}. Special case: A is a *logical truth* (i.e., ⊩ A) in case A is the conclusion of a canonical proof (cf. Definition 3.9).

*Remark 3.29* Truth in this sense may serve to single out certain atomic sentences, identity sentences, and logical compounds. And it is logical truth, rather than plain truth, that can be used to single out certain true identity sentences (i.e., self-identities) and logical compounds further. In general, the canonical derivations on which such truths are based can be seen as formal verifications (for verificationism in the context of proof-theoretic semantics see, e.g., Dummett, 1991; Prawitz, 2006; 2012).

#### **4 A philosophical application: counterpossibles**

We shall now apply the modal proof systems developed in the previous section to the following constructions (cf. Williamson, 2007, p. 174):


Given certain philosophical presuppositions, in particular the doctrine of the *necessity of identity* (NI, for short; cf. Kripke, 1980; Marcus, 1961), conditional sentences of this kind are sometimes called "counterpossibles" (see, e.g., Berto, French, Priest, and Ripley, 2018; Williamson, 2007), since their antecedents turn out to be impossible: If Hesperus is Phosphorus, then, given NI, this is so of necessity. Moreover, given the interdefinability of the operators for necessity and possibility (cf. Williamson, 2007, p. 295), guaranteed by *classical* modal logic, "their" distinctness is, therefore, impossible.

A further common presupposition in the discussion of counterpossibles is "orthodoxy" (cf. Berto, French, Priest, and Ripley, 2018, p. 694), that is, the aforementioned similarity semantics for counterfactuals. A semantics of this kind makes use of truth conditions and explains the formal meaning of counterfactuals in terms of subset relations on possible worlds. Roughly, a sentence of the form (1) is true at a world w exactly if its consequent is true in all the possible worlds in which its antecedent is true that are most similar to w. On this account, there are no antecedent-worlds in case the antecedent is impossible. This means that an instance of (1) with an impossible antecedent is true, since, given orthodoxy, its consequent is true (vacuously) at all the most similar antecedent-worlds. So-called *vacuists* (e.g., Williamson, 2007) accept this consequence for both (13) and (14). By contrast,

*non-vacuists* argue, typically admitting also impossible worlds into the orthodox picture (see, e.g., Berto, French, Priest, and Ripley, 2018 and the references therein), that the first counterpossible is false and that the second is true. Our discussion of (13) and (14) below will presuppose neither NI, nor classicality, nor orthodoxy.
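The orthodox truth conditions just described can be made concrete in a small sketch. Everything here (the three worlds, the valuation, and the similarity measure based on symmetric difference) is invented purely for illustration; the point is only that an impossible antecedent makes a would-counterfactual come out vacuously true on this account:

```python
# Toy evaluation of a would-counterfactual under the "orthodox" similarity
# semantics sketched above. Worlds, valuation, and the similarity measure
# are all invented for illustration.

def counterfactual(worlds, sim, w, antecedent, consequent):
    """True at w iff the consequent holds at all most similar antecedent-worlds."""
    a_worlds = [v for v in worlds if antecedent(v)]
    if not a_worlds:
        return True                       # impossible antecedent: vacuously true
    best = min(sim(w, v) for v in a_worlds)
    return all(consequent(v) for v in a_worlds if sim(w, v) == best)

worlds = [frozenset(), frozenset({"p"}), frozenset({"p", "q"})]
sim = lambda w, v: len(w ^ v)             # smaller symmetric difference = more similar
w0 = worlds[0]
p = lambda v: "p" in v
q = lambda v: "q" in v
impossible = lambda v: False              # an antecedent true at no world

print(counterfactual(worlds, sim, w0, p, q))           # False: closest p-world lacks q
print(counterfactual(worlds, sim, w0, impossible, q))  # True: vacuous
```

Vacuists and non-vacuists part ways precisely over the `if not a_worlds: return True` clause.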

We symbolize (13) and (14) as ¬a ≜ b ⊃_c ¬a ≜ a and ¬a ≜ b ⊃_c ¬a ≜ b, respectively. For simplicity, let the reference proof system S of the IFC-system contain only a single predicate constant (i.e., F) and two nominal constants (i.e., a, b). Moreover, let the term assumptions be, again, like in Example 2.13: Γ_F = {a, b}, Γ_a = {a}, and Γ_b = {b}. (15) is a canonical derivation for (13):

$$(\text{15})\quad \frac{\frac{\left[a\triangleq a\right]^{(2)}\quad\left[Fa\right]^{(3)}\quad\left(\text{\(\text{i}\)}\right)}{\frac{Fa}{F\Gamma}\left(as\to\text{E}\_{0}\right)}\quad(\text{asI})}{\frac{Fb}{Fb}\quad(as\text{I})}\quad\frac{\frac{\left[Fb\right]^{(3)}}{F\Gamma}\left(as\to\text{E}\_{0}\right)}{\frac{F\Gamma}{a}\left(as\to\text{E}\_{0}\right)}\quad(as\text{I})}{\left(\text{\(\text{i}\)}\right)}\text{(\(\text{i}\),\ \text{\(\text{i}\)},\ \text{\(\text{i}\)}}\quad(\text{s}\text{I})}$$

The conclusion of (15) is only a *thesis* of the specific IFC-system. (An alternative derivation would be, e.g., one in which the subderivation of a ≜ b in (15) were replaced by |a ≜ b|, or one in which it were replaced, e.g., by the S-derivation (4).) Note that we would not be in a position to assume ¬a ≜ b in the counterfactual mode if we had established ¬a ≜ b as a fact. If we had done so, we would not be in a position to arrive at the intended conclusion. (16) is a canonical derivation for (14). Its conclusion is a *theorem* of the IFC-system:

$$(16)\qquad \frac{[\neg a \triangleq b]^{(1)}}{\neg a \triangleq b \supset_{c} \neg a \triangleq b}\ (\supset_{c}\mathrm{I}),1$$

*Comments*:

1. We may regard both (13) and (14) as true (cf. Definition 3.28). The latter can be taken to be also logically true, since its canonical derivation is a proof. Thus, the present assessment of these sentences as true seems to be closer to the vacuist one. Note, however, that no appeal to some notion of vacuity is being made in the explanation of their truth.

2. (15) shows how the *self-distinctness* of Hesperus can be inferred from the counterfactual assumption of the distinctness of Hesperus and Phosphorus. On the present semantics, (15) is one of those derivations which constitute the meaning of its conclusion.

3. By Definition 3.25, the meaning of ¬a ≜ b ⊃_c ¬a ≜ a does not coincide with that of ¬a ≜ b ⊃_c ¬a ≜ b, nor do the meanings of any two theorems, or those of any two logically equivalent formulae. As a consequence, the present semantics is sensitive to *hyperintensional* distinctions (a recent collection on hyperintensionality is Duží and Jespersen, 2015; for discussion in the context of proof-theoretic semantics see, e.g., Pezlar, 2018).

4. Consider the structure of (4) and (6), which canonically derive a ≜ b and a ≜ a (an instance of self-identity), respectively. a ≜ a can be derived as a theorem in any IFC-system, whereas a ≜ b cannot be derived as a theorem in any such system. If we regard formulae that have been derived as theorems as *necessities*, and those that have been derived only as theses as *contingencies*, we may classify a ≜ a as necessary and a ≜ b as contingent.

5. In order to derive a ≜ b canonically, we have to look to the term assumptions and to apply the asI-rule. However, in order to derive a ≜ a canonically these steps are not required. If we regard, taking a derivation-oriented perspective, the conclusion of a canonical derivation as *a posteriori* in case the derivation requires an application of an as-rule, and if we regard the conclusion of a canonical derivation as *a priori* in case the derivation does not need to make use of such rules, we may classify a ≜ b as *a posteriori* and a ≜ a as *a priori*. Moreover, depending on whether the constants symbolize denoting or non-denoting names, we may also distinguish between a denotational (or referential) and a non-denotational kind of the *a posteriori*. Furthermore, taking this derivation-oriented perspective, we may also consider adding a distinction between an empirical and a non-empirical kind of the denotational *a posteriori*.

6. The above categorization of 'Hesperus is Hesperus' as necessary *a priori* and of 'Hesperus is Phosphorus' as contingent *a posteriori* is relatively old-fashioned in nature. It differs from Kripke's well-known proposal (cf. Kripke, 1980), according to which, given NI (and other prerequisites), 'Hesperus is Phosphorus' expresses an *a posteriori* necessity.

7. Analogous remarks apply to instances of (13) and (14) which feature *empty names* (e.g., let a and b symbolize 'Superman' and 'Clark Kent'). For such instances, the idea that a proper name is a rigid designator (and so denotes the same object in every possible world), which lies at the heart of NI, does not seem to be appealing as, intuitively, there is nothing for such names to designate, whether rigidly or not.

We shall next look at constructions related to (13) and (14), in order to obtain a sharper contrast. First, consider the following counterfactuals:


We symbolize (17) and (18) as a ≜ b ⊃_c a ≜ a and a ≜ a ⊃_c a ≜ b, respectively. Let the IFC-system be like that in the analysis of (13) and (14), but with Γ_F = {a}, Γ_a = {a}, and Γ_b = {b}.

$$(19) \qquad \frac{\frac{\left[\left[a\vec{\mathbf{\tilde{}}}\,\mathbf{\tilde{}}\,\mathbf{\tilde{}}\,\mathbf{\tilde{}}\right]^{(1)}\right]}{F\mathbf{\tilde{}}}(as\mathbf{\tilde{E}}\_{0})}{\frac{\left[\begin{array}{c}F\mathbf{\tilde{}}\,\right]}{F\mathbf{\tilde{}}}(as\mathbf{\tilde{E}}\_{0})}{a\mathbf{\tilde{I}}}(as\mathbf{\tilde{I}})}(as\mathbf{\tilde{I}}) \qquad \frac{\left[\begin{array}{c}Fa\right]^{(2\_{2})}}{F\mathbf{\tilde{I}}}(as\mathbf{\tilde{E}}\_{0})}{Fa\mathbf{\tilde{}}}(as\mathbf{\tilde{I}})\end{array}}{\frac{a\mathbf{\tilde{}}\,\mathbf{\tilde{}}\,a}{a\mathbf{\tilde{}}\,a}(\mathbf{\tilde{I}}\_{c}\mathbf{\tilde{I}}),1}(s\mathbf{\tilde{I}})$$

$$\frac{\frac{\frac{\Gamma}{\{a \triangleq a\}}{\{a \triangleq a\}}^{\{1\}} \text{illeg.} \quad \left[F a\right]^{\{2\_{1}\}}}{\frac{\frac{\Gamma}{F \Gamma} \left(a s \to\_{0} \right)}{F \Gamma} \quad \left(a \to\_{1} \right) \text{illeg.}} \quad \frac{\frac{\{F b\}^{\{2\_{2}\}}}{\frac{F \Gamma}{F \Gamma} \left(a s \to\_{0} \right)} \text{(a S.)}}{\frac{a \ddot{=} b}{a \ddot{=} a \supset\_{c} a \ddot{=} b} \left(\Box\_{c} \text{I)}, 1} \text{(\"I\"{.} ), 2\_{1}, 2\_{2}\text{)}}$$

In the present IFC-system, we can neither make a factual assumption of a ≜ b nor can we derive that formula by means of a canonical S-derivation. In (20), a ≜ a cannot be assumed in the counterfactual mode given (6), and a ≜ b cannot be derived by means of ≜I given that b ∉ Γ_F.

Now, consider the following two factuals:


We symbolize (21) as a ≜ b ⊃_f a ≜ a and (22) as a ≜ b ⊃_f a ≜ b. Let the IFC-system be exactly like that in the discussion of (13) and (14).

$$(2\text{i})\qquad\frac{\frac{\left[\left[a\triangleq a\right]\right]^{(1)}\quad\left[Fa\right]^{(2\_{\text{i}})}}{\frac{Fa}{F\Gamma}\left(as\to\mathbb{E}\_{0}\right)}\left(\left\|\mathbf{E}\right\|\right)}{\frac{Fb}{Fb}\quad(as\text{I})\qquad\frac{\left[Fb\right]^{(2\_{\text{i}})}}{Fa}\left(as\to\mathbb{E}\_{0}\right)}{\frac{Fa}{a\left\|a\right\|}\left(\left\|\mathbf{I}\right\|,2\_{\text{i}},2\_{2}\right)}\left(as\text{I}\right)$$

$$(24) \qquad \frac{\frac{\left[\left[a\ddot{\mathbf{\ddot{}}}\,b\right]\right]^{(1)} \quad \left[Fa\right]^{(2\_1)}}{Fb} (\ddot{\mathbf{\ddot{}}}\,\mathbf{\ddot{}})}{\frac{\left[\left[Fa\right]\right]}{F\Gamma} \quad a\Gamma} (a\mathbf{J}) \qquad \frac{\left[Fa\right]^{(2\_2)}}{F\Gamma} (a\mathbf{s}\mathbf{\ddot{\mathbf{I}}}\_0)}{\frac{F\Gamma}{a} \quad (a\mathbf{\ddot{I}})} (a\mathbf{s}\mathbf{I})$$
 
$$\frac{a\ddot{\mathbf{\ddot{}}}a}{a\ddot{\mathbf{\ddot{}}}\,b\\_{\mathbf{\ddot{}}}\,a\mathbf{\ddot{}}\,a} (\mathbf{\left[\mathbf{\ddot{}}\,b\right]}\,\mathbf{\dot{}}\,\mathbf{\dot{}})\,\mathbf{l}}{\frac{a\ddot{\mathbf{\ddot{}}}\,a}{a\ddot{\mathbf{\ddot{}}}\,b\\_{\mathbf{\ddot{}}}\,a\mathbf{\ddot{}}\,a} (\mathbf{\left[\mathbf{\ddot{}}\,b\right]}\,\mathbf{\dot{}}\,\mathbf{l})}$$

None of the conclusions of (19), (20), (23), and (24) is a theorem.

#### **5 Concluding remarks**

We have defined rudimentary modal natural deduction systems for reasoning with relatively simple *would*-counterfactuals and causal *since*-subordinator sentences. The systems are motivated by inferential practice. They allow for different modes of making assumptions relative to their reference proof systems, which serve to determine the factuality status of the formulae that are to be assumed.

Normalization and the subexpression/subformula property have been established for these systems along largely familiar lines. Due to the subexpression/subformula result, the systems are fully analytic. Due to the normalization result, the systems admit a proof-theoretic semantics. The proposed proof-theoretic semantics is acceptable from an intuitionistic point of view, since it is defined in terms of canonical derivations. Moreover, it is semantically autarkic, since the modal natural deduction systems do not draw on a formal semantics of a different kind (e.g., by internalizing a possible worlds similarity semantics).

Some aspects of the proposal are philosophically significant. Due to the absence of a semantic ontology (e.g., possible or impossible worlds), neither metaphysical nor epistemological considerations concerning such entities are triggered. Furthermore, an approach to formal epistemology is supported according to which we arrive at knowledge of counterfactuals and factuals by means of constructive derivation and proof.

It is hoped that useful and less rudimentary modal proof systems can be obtained along the lines suggested in this outline.

**Acknowledgements** I would like to thank the reviewers of earlier versions of the material included in this contribution for valuable comments. I also thank the audiences at the Universities of Tübingen (Third Tübingen Conference on Proof-Theoretic Semantics, 2019) and Frankfurt am Main (LoSe colloquium, 2019) where versions of this content have been presented for useful feedback. Support by the DFG (Project: Proof-theoretic foundations of intensional semantics; Grants: WI 3456/4-1 and WI 3456/4-2) is gratefully acknowledged.

#### **References**


Schnieder, B. (2011). A logic for 'because'. *The Review of Symbolic Logic* 4, 445–465.

Schroeder-Heister, P. (2004). On the notion of assumption in logical systems. In: *Selected Papers Contributed to the Sections of GAP5 (Fifth International Congress of the Society for Analytical Philosophy, Bielefeld, 22–26 September 2003)*. Ed. by R. Bluhm and C. Nimtz. Paderborn: Mentis, 27–48.


Open Access This chapter is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. If you remix, transform, or build upon this chapter or a part thereof, you must distribute your contributions under the same license as the original.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Some Set-Theoretic Reduction Principles**

Michael Bärtschi and Gerhard Jäger

**Abstract** In this article we study several reduction principles in the context of Simpson's set theory ATRˢ₀ and Kripke-Platek set theory KP (with infinity). Since ATRˢ₀ is the set-theoretic version of ATR₀, there is a direct link to second order arithmetic, and the results for reductions over ATRˢ₀ are as expected and more or less straightforward. However, over KP we obtain several interesting new results and are led to some open questions.

*Dedicated to Peter Schroeder-Heister*

#### **1 Introduction**

Peter Schroeder-Heister has been interested in the foundations of inference for many decades, most prominently in connection with a program that he baptized "proof-theoretic semantics". Though not related to this program in the strict sense, the work presented here is in direct connection to a talk given by the second author at the *Third Tübingen Conference on Proof-Theoretic Semantics* in 2019. Proof theory is the conceptual link between the foundational questions considered under the heading of proof-theoretic semantics and our research on subsystems of second order arithmetic and set theory.

About terminology: Let K be a class of formulas of second order arithmetic. What Simpson calls the K separation principle in second order arithmetic is the collection of all

$$\neg\exists i(\varphi[i] \land \psi[i]) \to \exists X \forall i((\varphi[i] \to i \in X) \land (i \in X \to \neg\psi[i])),$$

Michael Bärtschi
Institute of Computer Science, University of Bern, Switzerland, e-mail: michael.baertschi@inf.unibe.ch

Gerhard Jäger
Institute of Computer Science, University of Bern, Switzerland, e-mail: gerhard.jaeger@inf.unibe.ch

where φ[i] and ψ[i] are formulas from K. In Simpson (2009) various such separation principles have been studied. They play an interesting role in reverse mathematics and are equivalent — over a weak base theory — to certain comprehension principles. In particular, it is shown in Simpson (2009) that ACA₀ plus Σ¹₁ separation (Σ¹₁-Sep) is equivalent to the famous theory ATR₀ of arithmetical transfinite recursion.

However, this form of separation must not be confused with separation in set theory. There, K separation for a class K of formulas of set theory consists of all assertions

$$\forall x \exists \mathbf{y} (\mathbf{y} = \{z \in x : \varphi[z]\})$$

with φ[z] ranging over K. In order to avoid this conflict of notation we decided to call "reduction" what Simpson calls "separation"; see Definition 2.3. Thus we can use the same terminology in second order arithmetic and set theory.

In this article we consider several reduction principles, analogous to Simpson's separation principles, though in the context of his set theory ATRˢ₀ and in the context of Kripke-Platek set theory KP (with infinity). Since ATRˢ₀ is the set-theoretic version of ATR₀, the results for reductions over ATRˢ₀ are as expected and more or less straightforward. However, over KP we obtain several interesting results and are led to some open questions.

This article begins with a review of separation principles in second order arithmetic — now, of course, under the new term "reduction principles" — and some important equivalences to comprehension principles. Then we have a section in which some basics about the theory ATRˢ₀ and its extension ATRˢ are presented, before we address Kripke-Platek set theory KP and its relationship to ATRˢ₀. Finally, we turn to Σ₁ reduction (Σ₁-Red) and Π₁ reduction (Π₁-Red). The respective strengths of (Σ₁-Red) and (Π₁-Red) over ATRˢ₀ and ATRˢ are then determined, in general by making use of the quantifier theorem and the fact that ATRˢ₀ is equivalent to ATR₀. Afterwards, we change the environment and study (Σ₁-Red) and (Π₁-Red) in the context of KP. We end with some general comments and open problems. This paper is a mix of a survey article and new technical work.

#### **2 Well-known reduction principles in second order arithmetic**

Let L₂ be a standard language of second order arithmetic with countably infinite supplies of two distinct sorts of variables; we also have the constant symbols 0 and 1, function symbols for addition and multiplication, and relation symbols for the equality and less relations on the natural numbers. The first order variables are called *number variables* and are supposed to range over natural numbers. The second order variables are known as *set variables* and are intended to range over all sets of natural numbers. The number terms and formulas of L₂ are built up as usual. See, for example, Simpson (2009).

We use the following categories of letters (possibly with subscripts) as metavariables:



A formula of L₂ without bound set variables is called *arithmetical*. For 1 ≤ n ∈ ℕ, a formula is said to be Σ¹ₙ or Π¹ₙ if it is of the form

$$
\exists X\_1 \forall X\_2 \dots X\_n \theta \quad \text{or} \quad \forall X\_1 \exists X\_2 \dots X\_n \theta,
$$

respectively, where θ is arithmetical.

Throughout this paper we work in classical logic with equality for the first sort. Equality for sets in L₂ is defined by saying that two sets are identical if they contain the same elements.

ACA₀ is the system of second order arithmetic whose non-logical axioms comprise the defining axioms for all primitive recursive functions and relations, the axiom schema of *arithmetical comprehension*

$$\exists X \forall i (i \in X \leftrightarrow \varphi[i])$$

for all arithmetical formulas φ[i], and the *induction axiom*

$$
\forall X (0 \in X \land \forall i (i \in X \to i+1 \in X) \to \forall i (i \in X)).
$$

ACA<sup>0</sup> is known to be a conservative extension of Peano arithmetic PA. The theory ACA is obtained from ACA<sup>0</sup> by adding the *schema of induction*

$$
\varphi[0] \land \forall i (\varphi[i] \to \varphi[i+1]) \to \forall i \varphi[i],
$$

for all L₂ formulas φ[i]. Below we will make use of several further axiom schemas:

1. (Σ¹₁-AC) is the schema

$$\forall i \exists X\, \varphi[i, X] \to \exists Y \forall i\, \varphi[i, (Y)_i]$$

for arbitrary Σ¹₁ formulas φ[i, X];

2. (Δ¹₂-CA) is the schema

$$\forall i (\varphi[i] \leftrightarrow \psi[i]) \to \exists X \forall i (i \in X \leftrightarrow \varphi[i])$$

for all Σ¹₂ formulas φ[i] and Π¹₂ formulas ψ[i];

3. (Π¹₂-CA) is the schema

$$\exists X \forall i (i \in X \leftrightarrow \varphi[i])$$

for all Π¹₂ formulas φ[i].

To simplify the notation we write Δ¹₂-CA₀ for the theory ACA₀ + (Δ¹₂-CA) and Π¹₂-CA₀ for ACA₀ + (Π¹₂-CA).

Additional notation is necessary to formulate the principles of *arithmetical transfinite recursion* (ATR) and *bar induction* (BI), and to this end we follow Simpson (2009) as closely as possible.

Working in ACA0, we code binary relations on the natural numbers ℕ as subsets of ℕ via the pairing function

$$(i,j) := \left(i+j\right)^2 + i.$$
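As a quick illustration (not part of the formal development), this pairing is injective: for fixed s = i + j it uses the values s², ..., s² + s, and the next diagonal starts at (s + 1)² > s² + s. A brute-force check on an initial segment of ℕ × ℕ:

```python
# Sanity check of the pairing (i, j) := (i + j)**2 + i from the text:
# no two pairs in a 50 x 50 grid receive the same code.

def pair(i, j):
    return (i + j) ** 2 + i

codes = {pair(i, j) for i in range(50) for j in range(50)}
assert len(codes) == 50 * 50     # all codes distinct, hence injective here
print(pair(0, 0), pair(0, 1), pair(1, 0))   # 0 1 2
```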

A set X of natural numbers is said to be *reflexive* if

$$\forall i, j \left( (i, j) \in X \to ((i, i) \in X \land (j, j) \in X) \right).$$

If X is reflexive, then *Field*[X] is defined to be the set {i : (i, i) ∈ X}, and we write

$$\begin{aligned} i \leq\_X j &:= (i, j) \in X, \\ i <\_X j &:= (i, j) \in X \wedge (j, i) \notin X. \end{aligned}$$

Furthermore, if X is reflexive we say that X is *well-founded* if every non-empty subset of *Field*[X] has a <_X-minimal element.1 We say that X is a *linear ordering* if it is a reflexive linear ordering of its field, i.e.,

$$\forall i, j, k \left( i \leq\_X j \land j \leq\_X k \rightarrow i \leq\_X k \right),$$

$$\forall i, j \left( i \leq\_X j \land j \leq\_X i \rightarrow i = j \right),$$

$$\forall i, j \in Field[X]\; (i \leq_X j \lor j \leq_X i).$$

We say that X is a *well-ordering* if it is both well-founded and a linear ordering. Let *WF*[X], *LO*[X], and *WO*[X] be formulas saying that X is, respectively, well-founded, a linear ordering, and a well-ordering.
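For finite relations these notions are decidable by brute force, which can be used to illustrate them. In the sketch below the helper names are invented, and the relation is kept as a set of tuples (each of which the text would code as the single number (i + j)² + i):

```python
# Deciding LO[X] and WF[X] for a small reflexive relation X on N,
# represented as a set of pairs (i, j) meaning i <=_X j.
from itertools import chain, combinations

def field(X):
    return {i for (i, j) in X if i == j}          # Field[X] = {i : (i, i) in X}

def leq(X, i, j): return (i, j) in X
def lt(X, i, j):  return (i, j) in X and (j, i) not in X

def is_linear_ordering(X):
    F = field(X)
    transitive = all(leq(X, i, k) for i in F for j in F for k in F
                     if leq(X, i, j) and leq(X, j, k))
    antisymmetric = all(i == j for i in F for j in F
                        if leq(X, i, j) and leq(X, j, i))
    total = all(leq(X, i, j) or leq(X, j, i) for i in F for j in F)
    return transitive and antisymmetric and total

def is_well_founded(X):
    # Finite case: every non-empty subset of the field must contain a
    # <_X-minimal element; checked by brute force over all subsets.
    F = sorted(field(X))
    subsets = chain.from_iterable(combinations(F, n) for n in range(1, len(F) + 1))
    return all(any(not any(lt(X, j, i) for j in b) for i in b) for b in subsets)

# The usual ordering of {0, 1, 2}, made reflexive: a well-ordering.
X = {(i, j) for i in range(3) for j in range(3) if i <= j}
print(is_linear_ordering(X), is_well_founded(X))   # True True
```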

**Definition 2.1** Given an L₂ formula φ[i], let *TI*[X, φ] be the formula

$$\forall j ((\forall i <\_X j) \varphi[i] \to \varphi[j]) \to \forall j \varphi[j] .$$

The schema (BI) of *bar induction* consists of all formulas

$$\forall X (WF[X] \to TI[X,\varphi]),$$

where φ ranges over all L₂ formulas.

Now let φ[i, Y] be any formula with distinguished free number variable i and distinguished free set variable Y. Define H_φ[X, Y] to be the formula

$$LO[X] \land Y = \{ (i, j) : j \in Field[X] \land \varphi[i, Y^{j}] \},$$

where Y^{j} := {(i, k) ∈ Y : k <_X j}. Intuitively, H_φ[X, Y] says that X is a linear ordering and Y is the result of iterating φ along X.

<sup>1</sup> This is equivalent over ACA₀ to the definition given in Simpson (2009).
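The iteration described by H_φ[X, Y] can be illustrated on a finite linear ordering with a decidable operator. The operator `phi` below is invented purely for illustration; column j of Y is computed from the restriction of Y to earlier columns, mirroring Y^j:

```python
# Iterating an operator along a finite linear ordering, in the spirit of
# H_phi[X, Y]: column j of Y depends only on Y^j, i.e. on columns k <_X j.

def iterate_along(order, phi, universe):
    """order: field elements listed in <_X order; returns Y as a set of pairs."""
    Y = set()
    for j in order:
        below = set(Y)                     # Y^j: all columns built so far
        Y |= {(i, j) for i in universe if phi(i, below)}
    return Y

# phi[i, Z]: put i into the next column iff i == 0 or i - 1 occurs in Z.
phi = lambda i, Z: i == 0 or any(k == i - 1 for (k, _) in Z)

Y = iterate_along([0, 1, 2], phi, range(5))
columns = {j: sorted(i for (i, k) in Y if k == j) for j in [0, 1, 2]}
print(columns)   # {0: [0], 1: [0, 1], 2: [0, 1, 2]}
```

ATR asserts that such a Y exists along *every* well-ordering, for every arithmetical φ; the finite case shown here is of course trivial.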

**Defnition 2.2** The schema (ATR) of *arithmetical transfnite recursion* comprises

$$\forall X (WO[X] \to \exists Y \mathcal{H}\_{\varphi}[X, Y])$$

for all arithmetical formulas φ[i, Y]. Accordingly, we set

$$\mathsf{ATR}\_0 \coloneqq \mathsf{ACA}\_0 + (\mathsf{ATR}) \quad \text{and} \quad \mathsf{ATR} \coloneqq \mathsf{ACA} + (\mathsf{ATR}).$$

ACA₀ and ATR₀ belong to the "big five" in the Friedman-Simpson program of *reverse mathematics*:

$$\mathsf{RCA}_{0} \subsetneq \mathsf{WKL}_{0} \subsetneq \mathsf{ACA}_{0} \subsetneq \mathsf{ATR}_{0} \subsetneq \Pi^1_1\text{-}\mathsf{CA}_{0}.$$

For more about these theories and the program of reverse mathematics in general we refer to Simpson (2009).

It is also known that the proof-theoretic ordinals of ATR₀ and ATR are the ordinals Γ₀ and Γ_{ε₀}, respectively. For these results cf., for example, Friedman, McAloon and Simpson (1982) and Jäger (1980; 1984).

In the following the theory ATR₀ will play a major role. Arithmetical transfinite recursion is relevant here because of its remarkable equivalence to Π¹₁ reduction over ACA₀. This and related reduction principles are introduced now.

**Definition 2.3** Let n be a natural number greater than 0.

1. Σ¹ₙ *reduction* (Σ¹ₙ-Red) is the schema consisting of all formulas

$$\forall i(\varphi[i] \to \psi[i]) \to \exists X \forall i((\varphi[i] \to i \in X) \land (i \in X \to \psi[i])),$$

where φ[i] is a Π¹ₙ and ψ[i] a Σ¹ₙ formula.

2. Π¹ₙ *reduction* (Π¹ₙ-Red) is the schema consisting of all formulas

$$\forall i(\varphi[i] \to \psi[i]) \to \exists X \forall i((\varphi[i] \to i \in X) \land (i \in X \to \psi[i])),$$

where φ[i] is a Σ¹ₙ and ψ[i] a Π¹ₙ formula.

As mentioned in the introduction, what we call Σ¹ₙ reduction [Π¹ₙ reduction] has been called Π¹ₙ separation [Σ¹ₙ separation] by Simpson. Modulo this renaming the following characterizations are known.
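For decidable predicates on a finite slice of ℕ the reduction schema is trivially witnessed, which at least shows its shape (the example predicates are invented; genuine Σ¹ₙ/Π¹ₙ formulas admit no such decision procedure):

```python
# The reduction schema witnessed for decidable predicates: if phi[i] -> psi[i]
# for all i in the domain, then X = {i : phi[i]} satisfies both
# phi[i] -> i in X and i in X -> psi[i].

def reducing_set(phi, psi, domain):
    X = {i for i in domain if phi(i)}
    assert all(i in X for i in domain if phi(i))   # phi[i] -> i in X
    assert all(psi(i) for i in X)                  # i in X -> psi[i]
    return X

phi = lambda i: i % 6 == 0      # multiples of 6 ...
psi = lambda i: i % 2 == 0      # ... are all even, so phi -> psi holds
print(sorted(reducing_set(phi, psi, range(20))))   # [0, 6, 12, 18]
```

The logical content of the schema lies entirely in the complexity of the set X it asserts to exist, not in this trivial finite picture.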

#### **Theorem 2.4 (Buchholz-Schütte, Simpson)**

*(i) The theory* ACA₀ + (Σ¹₁-AC) *proves* (Σ¹₁-Red)*.*
*(ii)* ACA₀ + (Π¹₁-Red) *is equivalent to* ATR₀*.*
*(iii)* ACA₀ + (Σ¹₂-Red) *is equivalent to* Δ¹₂-CA₀*.*
*(iv)* ACA₀ + (Π¹₂-Red) *is equivalent to* Π¹₂-CA₀*.*

Assertion (i) is an easy observation; (ii) is a fairly complicated and technically demanding result presented in Simpson (2009); (iii) has been proved in Buchholz and Schütte (1988); (iv) is mentioned in Simpson (2009) as an exercise. The following theorem is an immediate consequence of Simpson (2009), Corollary VII.2.19.

**Theorem 2.5 (Simpson)** *The theory* ACA<sup>0</sup> + (BI) *proves all instances of* (ATR)*.*

#### **3 Basic set theory BS₀ and Simpson's ATRˢ₀**

After these introductory remarks we now turn to the two set theories that are in the center of this work: Simpson's ATRˢ₀ and Kripke-Platek set theory KP (with infinity).

The set-theoretic language L∈ is a one-sorted first order language with two binary relation symbols ∈ and =, countably many set variables, and the usual connectives and quantifiers of first order logic. Terms and formulas of L∈ are as usual.

We shall make use of the common set-theoretic terminology and employ the standard notational conventions. In addition, we use as metavariables (possibly with subscripts):

1. , , , , , , , , , , , , , for set-theoretic variables,

2. , , , for formulas.

We also follow the general conventions in defining the Δ₀, Σ, Π, Σₙ, and Πₙ formulas of L∈ (1 ≤ n ∈ ℕ).

*Basic set theory* BS₀ is a theory in the language L∈ — based on classical first order logic with equality — whose non-logical axioms are the universal closures of the following formulas:


It is set-theoretic folklore and mentioned, for example, in Simpson (2009) that BS<sup>0</sup> proves Δ<sup>0</sup> separation.

**Lemma 3.1** BS₀ *proves for all* Δ₀ *formulas* φ[x] *and all sets* a *that*

$$(\Delta\_0 \mathsf{Sep}) \qquad\qquad\qquad\exists \mathsf{y} \forall \mathsf{x} (\mathsf{x} \in \mathsf{y} \leftrightarrow \mathsf{x} \in a \land \varphi[\mathsf{x}]) .$$

The theory BS is the extension of BS₀ resulting from extending regularity from sets to arbitrary formulas φ[x],

$$
\exists \mathbf{x} \varphi[\mathbf{x}] \to \exists \mathbf{x} (\varphi[\mathbf{x}] \land (\forall \mathbf{y} \in \mathbf{x}) \neg \varphi[\mathbf{y}]) .
$$

Thus ∈-induction is available in BS for arbitrary L∈ formulas.

**Lemma 3.2** BS *proves, for any* L∈ *formula* φ[x]*,*

$$(\mathcal{L}\_{\in} \mathsf{I}\_{\in}) \qquad \qquad \forall \mathbf{x} ((\forall \mathbf{y} \in \mathsf{x}) \varphi[\mathbf{y}] \to \varphi[\mathbf{x}]) \to \forall \mathbf{x} \varphi[\mathbf{x}] .$$

*Proof* Aiming at the contrapositive, assume ¬φ[a] for some a and let b be a transitive set such that {a} ⊆ b. Now set

$$
\psi[x] \coloneqq x \in b \land \neg \varphi[x] \,.
$$

Clearly, ∃x ψ[x] and, therefore, the schema of regularity for formulas gives us an x satisfying

$$
\psi[x] \land (\forall y \in x)\, \neg\psi[y],
$$

i.e.,

$$x \in b \land \neg\varphi[x] \land (\forall y \in x)(y \notin b \lor \varphi[y]).$$

Since b is transitive, this can be simplified to

$$
\neg \varphi[x] \land (\forall y \in x) \varphi[y],
$$

and we have the desired statement. □

Moreover, we shall employ the common set-theoretic terminology and the standard notational conventions, for example:

$$\operatorname{Tran}[a] := (\forall \mathbf{x} \in a)(\forall \mathbf{y} \in \mathbf{x})(\mathbf{y} \in a),$$

$$\operatorname{Ord}[a] := \operatorname{Tran}[a] \wedge (\forall \mathbf{x} \in a) \operatorname{Tran}[\mathbf{x}],$$

$$\operatorname{Succ}[a] := \operatorname{Ord}[a] \wedge (\exists \mathbf{x} \in a)(a = \mathbf{x} \cup \{\mathbf{x}\}),$$

$$\operatorname{FinOrd}[a] := \begin{cases} \operatorname{Ord}[a] \wedge (a = \emptyset \lor \operatorname{Succ}[a]) \wedge \\ (\forall \mathbf{x} \in a)(\mathbf{x} = \emptyset \lor \operatorname{Succ}[\mathbf{x}]). \end{cases}$$

In addition, we let ω be the collection of all finite ordinals and observe that it forms a set in BS₀.

There is a natural translation of L₂ into L∈: The number variables of L₂ are interpreted in L∈ as ranging over ω, and the set variables of L₂ are interpreted in L∈ as ranging over the subsets of ω. This means that


Then one has to verify that BS₀ proves the existence of set-theoretic functions on ω that correspond to number-theoretic addition and multiplication. The number-theoretic less and equality relations go over into < and = on ω.

Clearly, each axiom of ACA<sup>0</sup> becomes a theorem of BS<sup>0</sup> under this translation. When working in L<sup>∈</sup> we shall from now on identify L<sup>2</sup> formulas with their translations into L∈.
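The interpretation of number variables can be illustrated by building the finite von Neumann ordinals explicitly as nested sets (a toy sketch; the function name is invented):

```python
# Finite von Neumann ordinals as nested frozensets: n is {0, 1, ..., n-1}.
# These are the set-theoretic objects over which the number variables of L_2
# range under the translation into the set-theoretic language.

def ordinal(n):
    o = frozenset()
    for _ in range(n):
        o = o | {o}                # successor step: a |-> a U {a}
    return o

# The ordinal n has exactly n elements, and membership orders the ordinals.
print(ordinal(2) in ordinal(3), len(ordinal(4)))   # True 4
```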

Simpson's ATRˢ₀ is obtained from BS₀ by adding the axiom of countability (C) and the axiom (Beta). In the definition below we write *Inj*[f, x, y] to state that f is an injective function from x to y.

**Definition 3.3** A set a is called *hereditarily countable* if there exist a transitive superset x of a and an injection f from x to ω,

$$HC[a] := \exists x, f\, (a \subseteq x \land \operatorname{Tran}[x] \land \operatorname{Inj}[f, x, \omega]).$$

The *axiom of countability* (C) claims that all sets are hereditarily countable,

$$(\mathsf{C}) \coloneqq \forall x HC[x].$$

In the formulation of the axiom (Beta) we write *Dom*[f, a] to express that f is a function with domain a. We write ⟨x, y⟩ for the ordered pair of x and y and a × b for the Cartesian product of a and b. Also, r ⊆ a × a is called *well-founded on* a if every non-empty subset of a has an r-minimal element,

$$Wf[a,r] := (\forall b \subseteq a)(b \neq \emptyset \to (\exists x \in b)(\forall y \in b)(\langle y, x \rangle \notin r)).$$

**Defnition 3.4** The *axiom* (Beta) is the universal closure of the formula

$$\mathcal{W}f[a,r] \to \exists f(Dom[f,a] \land (\forall \mathbf{x} \in a)(f(\mathbf{x}) = \{f(\mathbf{y}) : \mathbf{y} \in a \land \langle \mathbf{y}, \mathbf{x} \rangle \in r\})).$$

This function f is said to be the *collapsing function* for r on a.

The axiom (Beta) has the effect of making the Π₁ predicate *Wf*[a, r] a Δ₁ predicate, since the existence of a collapsing function for r on a obviously implies the well-foundedness of r on a.
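For a finite well-founded relation the collapsing function of Definition 3.4 can be computed directly by recursion on r. In the sketch below (helper names invented), collapsing the strict less relation on {0, 1, 2} recovers the von Neumann ordinals:

```python
# Computing the collapsing function for a finite well-founded relation r on a:
# f(x) = {f(y) : y in a and <y, x> in r}. Recursion terminates because r is
# assumed well-founded.

def collapse(a, r):
    f = {}
    def value(x):
        if x not in f:
            f[x] = frozenset(value(y) for y in a if (y, x) in r)
        return f[x]
    for x in a:
        value(x)
    return f

# Collapsing strict "less" on {0, 1, 2} yields f(0) = {}, f(1) = {f(0)},
# f(2) = {f(0), f(1)}: exactly the von Neumann ordinals 0, 1, 2.
a = [0, 1, 2]
r = {(y, x) for y in a for x in a if y < x}
f = collapse(a, r)
print(f[0] == frozenset(), f[1] == frozenset({f[0]}), f[2] == frozenset({f[0], f[1]}))
```

(Beta) asserts the existence of this f for arbitrary well-founded r, where no such effective computation is available.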

With these definitions the stage is set to introduce the theory ATRˢ₀ and its extension ATRˢ:

ATRˢ₀ := BS₀ + (Beta) + (C) and ATRˢ := BS + (Beta) + (C).

Below we write ATRˢ₀\(C) for the subsystem of ATRˢ₀ without the axiom of countability and ATRˢ\(C) for ATRˢ without (C).

As we have mentioned above, there is a natural translation of L<sub>2</sub> into L<sub>∈</sub>. The converse direction, i.e., the translation of L<sub>∈</sub> into L<sub>2</sub>, is more complicated. In a nutshell: The sets of L<sub>∈</sub> are represented by so-called *suitable trees*. Any suitable tree T is a well-founded subset of ℕ<sup><ℕ</sup>, and if ⟨n⟩ ∈ T then T<sup>⟨n⟩</sup> is the subtree {σ : ⟨n⟩ ∗ σ ∈ T} of T, with σ ranging over (the codes of) finite sequences. Elementhood is then coded by defining

$$S \in^{*} T := \exists n(\langle n \rangle \in T \land S \simeq T^{\langle n \rangle}),$$

where S ≃ T<sup>⟨n⟩</sup> says that there exists a specific tree isomorphism between S and T<sup>⟨n⟩</sup>. For all details concerning this representation we refer to Simpson (2009). There it is also described how we associate to each formula φ of L<sub>∈</sub> a formula |φ| of L<sub>2</sub>.

The following two results of Simpson (2009) make it clear that ATR<sup>S</sup><sub>0</sub> is the set-theoretic variant of ATR<sub>0</sub>.

#### **Theorem 3.5 (Simpson)**

*(i) Every axiom of* ATR<sub>0</sub> *is a theorem of* ATR<sup>S</sup><sub>0</sub>*.*
*(ii) If* φ *is an axiom of* ATR<sup>S</sup><sub>0</sub>*, then* |φ| *is a theorem of* ATR<sub>0</sub>*.*

This theorem can be easily extended to ATR<sup>S</sup>. On the side of second order arithmetic, arithmetical transfinite recursion simply has to be replaced by bar induction.

#### **Theorem 3.6**

*(i) Every axiom of* ACA<sub>0</sub> + (BI) *is a theorem of* ATR<sup>S</sup>\(C)*.*
*(ii) If* φ *is an axiom of* ATR<sup>S</sup>*, then* |φ| *is a theorem of* ACA<sub>0</sub> + (BI)*.*

*Proof* It is a classic result that all instances of (BI) can be proved by means of (Beta) and ∈-induction (L<sub>∈</sub>-I<sub>∈</sub>); see, for example, Jäger (1986). For (ii) we refer to Simpson (2009), Theorem VII.3.34 and Exercise VII.3.38. □

**Corollary 3.7** *If we write* |T| *for the proof-theoretic ordinal of the theory* T*, then we have:*

*(i)* |ATR<sub>0</sub>| = |ATR<sup>S</sup><sub>0</sub>| = Γ<sub>0</sub>*.*
*(ii)* |ATR| = Γ<sub>ε<sub>0</sub></sub>*.*
*(iii)* |ATR<sup>S</sup>| = ψ(ε<sub>Ω+1</sub>) *(Bachmann-Howard ordinal).*

We end this introductory section by stating the so-called *quantifier theorem*, which relates formulas of L<sub>2</sub> with formulas of L<sub>∈</sub>. The formulation below is from Simpson (2009). Its first part also follows from corresponding results in Jäger (1979; 1986).

**Theorem 3.8 (Quantifier theorem; Jäger, Simpson)** *Let* n *be any natural number.*


In many set theories (Kripke-Platek set theory is a typical example) we do not have to distinguish much between Δ<sub>0</sub> and Δ<sub>1</sub> formulas. That this is not the case for ATR<sup>S</sup><sub>0</sub> is an immediate consequence of the quantifier theorem. Recall from Lemma 3.1 that ATR<sup>S</sup><sub>0</sub> proves Δ<sub>0</sub> separation. However:

#### **Corollary 3.9** *There are instances of* Δ<sub>1</sub> *separation that are not provable in* ATR<sup>S</sup>*.*

*Proof* In view of the quantifier theorem, ATR<sup>S</sup> + (Δ<sub>1</sub>-Sep) comprises the theory Δ<sup>1</sup><sub>2</sub>-CA<sub>0</sub>, whose proof-theoretic ordinal is much greater than the Bachmann-Howard ordinal. Therefore, (Δ<sub>1</sub>-Sep) cannot be provable in ATR<sup>S</sup>. □

#### **4 Kripke-Platek set theory KP and its relationship to ATR<sup>S</sup><sub>0</sub>**

Kripke-Platek set theory KP (with infinity) is one of the best studied subsystems of Zermelo-Fraenkel set theory ZF. The transitive models of KP are called *admissible sets*, and L<sub>ω<sub>1</sub><sup>CK</sup></sub>, where ω<sub>1</sub><sup>CK</sup> denotes the first non-recursive ordinal, is the least standard model of KP. Kripke-Platek set theory and admissible sets play an important role in generalized recursion theory, definability theory and, of course, in proof theory.

Kripke-Platek set theory KP is obtained from BS by adding the schema of Δ<sub>0</sub> collection, i.e.,

$$(\Delta_0\mathsf{-Col}) \qquad (\forall x \in a)\exists y\,\varphi[x, y] \to \exists z(\forall x \in a)(\exists y \in z)\varphi[x, y]$$

for all Δ<sub>0</sub> formulas φ[x, y];

KP := BS + (Δ<sub>0</sub>-Col).<sup>2</sup>

KP<sub>0</sub> is the subsystem of KP where regularity is restricted to sets (as in BS<sub>0</sub>), i.e.,

KP<sub>0</sub> := BS<sub>0</sub> + (Δ<sub>0</sub>-Col).

Thus, obviously, KP<sub>0</sub> + (Beta) and KP + (Beta) are the same theories as ATR<sup>S</sup><sub>0</sub>\(C) + (Δ<sub>0</sub>-Col) and ATR<sup>S</sup>\(C) + (Δ<sub>0</sub>-Col), respectively.

Below we list a series of known results indicating that the relationship between ATR<sup>S</sup><sub>0</sub> and KP is quite intricate. They are mostly taken from Simpson (1980; 2009) and Jäger (1982; 1986).

#### **Theorem 4.1 (Overview)**


From that we immediately deduce that ATR<sup>S</sup><sub>0</sub> and KP are incomparable in the sense that

KP ⊈ ATR<sup>S</sup><sub>0</sub> and ATR<sup>S</sup><sub>0</sub> ⊈ KP.

We even have KP<sub>0</sub> ⊈ ATR<sup>S</sup>. Otherwise, ATR<sup>S</sup> would comprise KP + (Beta), and the ordinal of this theory is greater than the Bachmann-Howard ordinal.

<sup>2</sup> Traditionally, KP is defined to be the set theory whose non-logical axioms consist of Extensionality, Pairing, Union, Infinity, (L<sub>∈</sub>-I<sub>∈</sub>), (Δ<sub>0</sub>-Sep), and (Δ<sub>0</sub>-Col); but both definitions are equivalent.

#### **5 Reduction axioms in set theory**

The next step is to add reduction principles, similar to those of second order arithmetic in Definition 2.3, to our set theories.

#### **Definition 5.1**

1. Σ<sub>1</sub> *reduction* (Σ<sub>1</sub>-Red) is the schema consisting of all formulas

$$(\forall x \in a)(\varphi[x] \to \psi[x]) \to (\exists y \subseteq a)(\forall x \in a)((\varphi[x] \to x \in y) \land (x \in y \to \psi[x])),$$

where φ[x] is a Π<sub>1</sub> and ψ[x] a Σ<sub>1</sub> formula.

2. Π<sub>1</sub> *reduction* (Π<sub>1</sub>-Red) is the schema consisting of all formulas

$$(\forall x \in a)(\varphi[x] \to \psi[x]) \to (\exists y \subseteq a)(\forall x \in a)((\varphi[x] \to x \in y) \land (x \in y \to \psi[x])),$$

where φ[x] is a Σ<sub>1</sub> and ψ[x] a Π<sub>1</sub> formula.

Our aim is to analyze the strengths of (Σ<sub>1</sub>-Red) and (Π<sub>1</sub>-Red) in the context of Simpson's ATR<sup>S</sup><sub>0</sub> and Kripke-Platek set theory. We begin with ATR<sup>S</sup><sub>0</sub> and ATR<sup>S</sup>, where the situation is clear.

#### **5.1 ATR<sup>S</sup><sub>0</sub> and ATR<sup>S</sup> plus (Σ<sub>1</sub>-Red) and (Π<sub>1</sub>-Red)**

Theorem 3.5 and Theorem 3.6, in combination with the quantifier theorem, immediately give us

$$\mathsf{ATR}\_{0} + (\Sigma^{1}\_{2}\mathsf{-Red}) \subseteq \mathsf{ATR}^{S}\_{0} + (\Sigma\_{1}\mathsf{-Red}),$$

$$\mathsf{ACA}_{0} + (\mathsf{BI}) + (\Sigma^{1}_{2}\mathsf{-Red}) \subseteq \mathsf{ATR}^{S} + (\Sigma_{1}\mathsf{-Red}),$$

$$\mathsf{ATR}\_{0} + (\Pi^{1}\_{2}\mathsf{-Red}) \subseteq \mathsf{ATR}^{S}\_{0} + (\Pi\_{1}\mathsf{-Red}),$$

$$\mathsf{ACA}_{0} + (\mathsf{BI}) + (\Pi^{1}_{2}\mathsf{-Red}) \subseteq \mathsf{ATR}^{S} + (\Pi_{1}\mathsf{-Red}).$$

Therefore, together with Theorem 2.4 we have the following lower bound results for (Σ<sub>1</sub>-Red) and (Π<sub>1</sub>-Red).

#### **Theorem 5.2**

*(i)* Δ<sup>1</sup><sub>2</sub>-CA<sub>0</sub> ⊆ ACA<sub>0</sub> + (Σ<sup>1</sup><sub>2</sub>-Red) ⊆ ATR<sup>S</sup><sub>0</sub> + (Σ<sub>1</sub>-Red)*.*
*(ii)* Δ<sup>1</sup><sub>2</sub>-CA<sub>0</sub> + (BI) ⊆ ACA<sub>0</sub> + (Σ<sup>1</sup><sub>2</sub>-Red) + (BI) ⊆ ATR<sup>S</sup> + (Σ<sub>1</sub>-Red)*.*
*(iii)* Π<sup>1</sup><sub>2</sub>-CA<sub>0</sub> ⊆ ACA<sub>0</sub> + (Π<sup>1</sup><sub>2</sub>-Red) ⊆ ATR<sup>S</sup><sub>0</sub> + (Π<sub>1</sub>-Red)*.*
*(iv)* Π<sup>1</sup><sub>2</sub>-CA<sub>0</sub> + (BI) ⊆ ACA<sub>0</sub> + (Π<sup>1</sup><sub>2</sub>-Red) + (BI) ⊆ ATR<sup>S</sup> + (Π<sub>1</sub>-Red)*.*

Turning to the converse directions, we first make use of the quantifier theorem again and observe that for every (closed) instance φ of (Σ<sub>1</sub>-Red) and (Π<sub>1</sub>-Red) the corresponding L<sub>2</sub> formula |φ| is derivable in ATR<sub>0</sub> + (Σ<sup>1</sup><sub>2</sub>-Red) and ATR<sub>0</sub> + (Π<sup>1</sup><sub>2</sub>-Red), respectively. Therefore, Theorem 3.5 and Theorem 3.6 yield for all sentences φ of L<sub>∈</sub>:

$$\begin{aligned} \mathsf{ATR}^{S}_{0} + (\Sigma_{1} \mathsf{-Red}) &\vdash \varphi \implies \mathsf{ATR}_{0} + (\Sigma^{1}_{2} \mathsf{-Red}) \vdash |\varphi|, \\ \mathsf{ATR}^{S} + (\Sigma_{1} \mathsf{-Red}) &\vdash \varphi \implies \mathsf{ACA}_{0} + (\Sigma^{1}_{2} \mathsf{-Red}) + (\mathsf{BI}) \vdash |\varphi|, \\ \mathsf{ATR}^{S}_{0} + (\Pi_{1} \mathsf{-Red}) &\vdash \varphi \implies \mathsf{ATR}_{0} + (\Pi^{1}_{2} \mathsf{-Red}) \vdash |\varphi|, \\ \mathsf{ATR}^{S} + (\Pi_{1} \mathsf{-Red}) &\vdash \varphi \implies \mathsf{ACA}_{0} + (\Pi^{1}_{2} \mathsf{-Red}) + (\mathsf{BI}) \vdash |\varphi|. \end{aligned}$$

The following upper bounds for (Σ<sub>1</sub>-Red) and (Π<sub>1</sub>-Red) are straightforward consequences of Theorem 2.4.

**Theorem 5.3** *We have for every sentence* φ *of* L<sub>∈</sub>*:*

*(i)* ATR<sup>S</sup><sub>0</sub> + (Σ<sub>1</sub>-Red) ⊢ φ ⟹ Δ<sup>1</sup><sub>2</sub>-CA<sub>0</sub> ⊢ |φ|*.*
*(ii)* ATR<sup>S</sup> + (Σ<sub>1</sub>-Red) ⊢ φ ⟹ Δ<sup>1</sup><sub>2</sub>-CA<sub>0</sub> + (BI) ⊢ |φ|*.*
*(iii)* ATR<sup>S</sup><sub>0</sub> + (Π<sub>1</sub>-Red) ⊢ φ ⟹ Π<sup>1</sup><sub>2</sub>-CA<sub>0</sub> ⊢ |φ|*.*
*(iv)* ATR<sup>S</sup> + (Π<sub>1</sub>-Red) ⊢ φ ⟹ Π<sup>1</sup><sub>2</sub>-CA<sub>0</sub> + (BI) ⊢ |φ|*.*

To sum up, we know that (Σ<sub>1</sub>-Red) and (Π<sub>1</sub>-Red) added to ATR<sup>S</sup><sub>0</sub> and ATR<sup>S</sup> lead to the following proof-theoretic equivalences:

$$\begin{aligned} \mathsf{ATR}^{S}_{0} + (\Sigma_{1} \mathsf{-Red}) &\equiv \Delta^{1}_{2} \mathsf{-CA}_{0}, \\ \mathsf{ATR}^{S} + (\Sigma_{1} \mathsf{-Red}) &\equiv \Delta^{1}_{2} \mathsf{-CA}_{0} + (\mathsf{BI}), \\ \mathsf{ATR}^{S}_{0} + (\Pi_{1} \mathsf{-Red}) &\equiv \Pi^{1}_{2} \mathsf{-CA}_{0}, \\ \mathsf{ATR}^{S} + (\Pi_{1} \mathsf{-Red}) &\equiv \Pi^{1}_{2} \mathsf{-CA}_{0} + (\mathsf{BI}). \end{aligned}$$

So for (Σ<sub>1</sub>-Red) and (Π<sub>1</sub>-Red) the situation is clear as long as we stay in the context of ATR<sup>S</sup><sub>0</sub> and its extension ATR<sup>S</sup>. The picture is completely different when we move to Kripke-Platek set theory.

#### **5.2 KP<sub>0</sub> and KP plus (Σ<sub>1</sub>-Red) and (Π<sub>1</sub>-Red)**

A first observation is that (Σ<sub>1</sub>-Red) is irrelevant for Kripke-Platek set theory: it is provable there.

**Lemma 5.4** *Every instance of* (Σ<sub>1</sub>-Red) *is provable in* KP<sub>0</sub>*.*

*Proof* Suppose that, for some Π<sub>1</sub> formula φ[x] and Σ<sub>1</sub> formula ψ[x],

$$(\forall x \in a)(\varphi[x] \to \psi[x]).$$

By Σ reflection there exists a set b such that

$$(\forall x \in a)(\varphi^b[x] \to \psi^b[x]).$$

Then y := {x ∈ a : φ<sup>b</sup>[x]} is the set we need. □
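The Σ reflection principle invoked here is the standard one of Kripke-Platek set theory (see Barwise 1975); we state it for reference, with θ<sup>b</sup> denoting the result of replacing every unbounded existential quantifier in θ by one bounded by b:

```latex
\[
  \theta \;\rightarrow\; \exists b\, \theta^{b}
  \qquad \text{for every } \Sigma \text{ formula } \theta.
\]
% In the proof above it is applied to the formula
% (forall x in a)(phi[x] -> psi[x]), which is (equivalent to) a
% Sigma formula since phi[x] is Pi_1 and psi[x] is Sigma_1.
```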

A further obvious observation is that an upper bound for (Π<sub>1</sub>-Red) is provided by Σ<sub>1</sub> separation, i.e., the schema

$$(\Sigma_1\mathsf{-Sep}) \qquad \exists y \forall x (x \in y \leftrightarrow x \in a \land \varphi[x])$$

for all Σ<sub>1</sub> formulas φ[x]. To see why, take a Σ<sub>1</sub> formula φ[x] and a Π<sub>1</sub> formula ψ[x] such that

$$(\forall x \in a)(\varphi[x] \to \psi[x])$$

for some set a. By (Σ<sub>1</sub>-Sep) we can define the set y := {x ∈ a : φ[x]}, which is an obvious witness for (Π<sub>1</sub>-Red). Thus we have the following upper bounds.
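Spelled out, the verification that y witnesses (Π<sub>1</sub>-Red) runs as follows (our own elaboration of the one-line argument above):

```latex
\[
\begin{aligned}
  \varphi[x] &\to x \in y
    && \text{by the definition } y := \{ x \in a : \varphi[x] \}, \\
  x \in y &\to \varphi[x] \to \psi[x]
    && \text{by the assumption } (\forall x \in a)(\varphi[x] \to \psi[x]).
\end{aligned}
\]
```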

#### **Theorem 5.5**

*(i)* KP<sub>0</sub> + (Π<sub>1</sub>-Red) ⊆ KP<sub>0</sub> + (Σ<sub>1</sub>-Sep)*.*
*(ii)* KP + (Π<sub>1</sub>-Red) ⊆ KP + (Σ<sub>1</sub>-Sep)*.*

It is still an open question whether these bounds are sharp. However, we have some partial results.

**Theorem 5.6** *We have for all formulas* φ *of* L<sub>2</sub>*:*

*(i)* Π<sup>1</sup><sub>2</sub>-CA<sub>0</sub> ⊢ φ ⟹ KP<sub>0</sub> + (Beta) + (Π<sub>1</sub>-Red) ⊢ φ*.*
*(ii)* Π<sup>1</sup><sub>2</sub>-CA<sub>0</sub> + (BI) ⊢ φ ⟹ KP + (Beta) + (Π<sub>1</sub>-Red) ⊢ φ*.*

*Proof* We first recall from Theorem 2.4 that Π<sup>1</sup><sub>2</sub>-CA<sub>0</sub> is equivalent to ACA<sub>0</sub> + (Π<sup>1</sup><sub>2</sub>-Red). From the first part of the quantifier theorem we deduce that, within KP<sub>0</sub> + (Beta), every Σ<sup>1</sup><sub>2</sub> formula of L<sub>2</sub> is equivalent to a Σ<sub>1</sub> formula and every Π<sup>1</sup><sub>2</sub> formula of L<sub>2</sub> is equivalent to a Π<sub>1</sub> formula. Therefore, every instance of (Π<sup>1</sup><sub>2</sub>-Red) is provable in KP<sub>0</sub> + (Beta) + (Π<sub>1</sub>-Red). The rest follows from Theorem 3.6. □

Now we recall from, for example, Rathjen (1999) that KP<sub>0</sub> + (Σ<sub>1</sub>-Sep) and KP + (Σ<sub>1</sub>-Sep) prove the same L<sub>2</sub> sentences as Π<sup>1</sup><sub>2</sub>-CA<sub>0</sub> and Π<sup>1</sup><sub>2</sub>-CA<sub>0</sub> + (BI), respectively. Moreover, by following Barwise (1975) (with some small modifications), we can also show that KP<sub>0</sub> + (Σ<sub>1</sub>-Sep) proves (Beta). Thus the following assertions are direct consequences of the previous two theorems.

#### **Corollary 5.7**


Shown schematically, we therefore have the following proof-theoretic equivalences:

$$\mathsf{KP}_{0} + (\mathsf{Beta}) + (\Pi_{1}\mathsf{-Red}) \equiv \mathsf{KP}_{0} + (\Sigma_{1}\mathsf{-Sep}) \equiv \Pi^{1}_{2}\mathsf{-CA}_{0},$$

$$\mathsf{KP} + (\mathsf{Beta}) + (\Pi_{1}\mathsf{-Red}) \equiv \mathsf{KP} + (\Sigma_{1}\mathsf{-Sep}) \equiv \Pi^{1}_{2}\mathsf{-CA}_{0} + (\mathsf{BI}).$$

The axiom of constructibility states that every set belongs to some level of Gödel's hierarchy of constructible sets:

$$(V \!\!= \!\!L) \qquad\qquad\qquad\qquad\qquad\forall \mathbf{x} \exists \alpha (\mathbf{x} \in L\_{\alpha}) .$$

In addition, we write (x <<sub>L</sub> y) to express that x is smaller than y according to the canonical well-order <<sub>L</sub> of the constructible universe. It is well-known that (x ∈ L<sub>α</sub>) and (x <<sub>L</sub> y) are Δ over KP. Moreover, (∃x <<sub>L</sub> y) and (∀x <<sub>L</sub> y) may be treated as bounded quantifiers. (x ∈ L) is short for ∃α(x ∈ L<sub>α</sub>). For more on the constructible universe see, e.g., Barwise (1975) or Kunen (1980).
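For reference, Gödel's constructible hierarchy is defined by transfinite recursion, where Def(X) is the collection of subsets of X definable over (X, ∈) with parameters from X (a standard definition, supplied here for the reader; see Barwise 1975 or Kunen 1980):

```latex
\[
\begin{aligned}
  L_{0} &= \emptyset, \\
  L_{\alpha+1} &= \mathrm{Def}(L_{\alpha}), \\
  L_{\lambda} &= \bigcup_{\alpha < \lambda} L_{\alpha}
    \quad \text{for limit ordinals } \lambda, \\
  L &= \bigcup_{\alpha \in \mathrm{Ord}} L_{\alpha}.
\end{aligned}
\]
```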

In Jäger and Steila (2018) an interesting separation principle is introduced and studied. Call a quantifier *subset bounded* if it ranges over the subsets of a given set. Then let ∃<sup>𝒫</sup>(Δ<sub>1</sub>) separation be the separation principle that, given any set a, allows the introduction of all subsets of a defined by a subset bounded Σ<sub>1</sub> formula over a Δ<sub>1</sub> matrix, i.e.,

$$(\exists^{\mathcal{P}}(\Delta_1)\mathsf{-Sep}) \qquad \begin{aligned} &(\forall x \in a)(\forall y \subseteq a)(\varphi[x, y] \leftrightarrow \psi[x, y]) \to \\ &\qquad (\exists z \subseteq a)(\forall x \in a)(x \in z \leftrightarrow (\exists y \subseteq a)\varphi[x, y]), \end{aligned}$$

for all Σ<sub>1</sub> formulas φ[x, y] and Π<sub>1</sub> formulas ψ[x, y].

The relationship between (Σ<sub>1</sub>-Sep) and (∃<sup>𝒫</sup>(Δ<sub>1</sub>)-Sep) is interesting. It is easy to see that all instances of (∃<sup>𝒫</sup>(Δ<sub>1</sub>)-Sep) are provable in KP + (Σ<sub>1</sub>-Sep). For the converse direction see the following theorem from Jäger and Steila (2018).

**Theorem 5.8 (Jäger and Steila)** *All instances of* (Σ<sub>1</sub>-Sep) *are provable in* KP + (*V*=*L*) + (∃<sup>𝒫</sup>(Δ<sub>1</sub>)-Sep)*.*

Therefore, in order to show that the theory KP + (*V*=*L*) + (Π<sub>1</sub>-Red) contains KP + (Σ<sub>1</sub>-Sep), it is sufficient to prove that it contains (∃<sup>𝒫</sup>(Δ<sub>1</sub>)-Sep).

**Lemma 5.9** KP + (*V*=*L*) + (Π<sub>1</sub>-Red) *proves all instances of* (∃<sup>𝒫</sup>(Δ<sub>1</sub>)-Sep)*.*

*Proof* Working in KP + (*V*=*L*) + (Π<sub>1</sub>-Red), let us assume that

$$(\forall x \in a)(\forall y \subseteq a)(\varphi[x, y] \leftrightarrow \psi[x, y])$$

for a Σ<sub>1</sub> formula φ[x, y] and a Π<sub>1</sub> formula ψ[x, y]. We define

$$\widetilde{\varphi}[x, y] := \varphi[x, y] \land (\forall z <_L y)(z \subseteq a \to \neg\psi[x, z]),$$

$$\widetilde{\psi}[x, y] := \psi[x, y] \land (\forall z <_L y)(z \subseteq a \to \neg\varphi[x, z]).$$

Thus φ̃[x, y] is a Σ formula and ψ̃[x, y] is a Π formula, and we have:

(1) (∀x ∈ a)(∀y ⊆ a)(φ̃[x, y] ↔ ψ̃[x, y]).


(2) φ̃[x, y] → φ[x, y].
(3) y ⊆ a ∧ φ[x, y] → (∃z ⊆ a)φ̃[x, z].
(4) (∀y, z ⊆ a)(φ̃[x, y] ∧ φ̃[x, z] → y = z).

We also define

$$\mathfrak{A}[u] := (\exists v, w \in a)(\exists y \subseteq a)(u = \langle v, w \rangle \land \widetilde{\varphi}[v, y] \land w \in y),$$

$$\mathfrak{B}[u] := (\forall v, w \in a)(\forall y \subseteq a)(u = \langle v, w \rangle \land \widetilde{\varphi}[v, y] \to w \in y).$$

Hence 𝔄[u] is equivalent to a Σ<sub>1</sub> formula and 𝔅[u] to a Π<sub>1</sub> formula. In addition, because of (4) we have

$$(\forall u \in a \times a)(\mathfrak{A}[u] \to \mathfrak{B}[u])\,.$$

So by (Π<sub>1</sub>-Red) there is a b ⊆ a × a such that

$$(\forall u \in a \times a)((\mathfrak{A}[u] \to u \in b) \land (u \in b \to \mathfrak{B}[u])).$$

Claim: For all v ∈ a,

$$(*)\qquad (\exists y \subseteq a)\varphi[v, y] \leftrightarrow \widetilde{\varphi}[v, (b)_v],$$

where (b)<sub>v</sub> stands for the set {w ∈ a : ⟨v, w⟩ ∈ b}.

Proof of the claim. The direction from right to left is obvious. To prove the converse, assume (∃y ⊆ a)φ[v, y]. Then there is a c ⊆ a with φ̃[v, c] according to (3), and for this c we have c = (b)<sub>v</sub>. This can be seen as follows: For all w ∈ a,

$$w \in c \to \mathfrak{A}\left[\left\langle \nu, w \right\rangle\right] \to \left\langle \nu, w \right\rangle \in b \to w \in (b)\_{\mathbb{V}},$$

$$w \in (b)\_{\mathbb{V}} \to \left\langle \nu, w \right\rangle \in b \to \mathfrak{B}\left[\left\langle \nu, w \right\rangle\right] \to w \in c.$$

Since c = (b)<sub>v</sub> and φ̃[v, c], we have φ̃[v, (b)<sub>v</sub>] as desired, finishing the proof of the claim.

In view of (1) the set {v ∈ a : φ̃[v, (b)<sub>v</sub>]} exists by Δ separation and is the set we need according to (∗). □

The following is an immediate consequence of Theorem 5.5, Theorem 5.8, and the previous lemma.

**Corollary 5.10** *We have for all* L<sub>∈</sub> *formulas* φ *that*

KP + (*V*=*L*) + (Σ<sub>1</sub>-Sep) ⊢ φ ⟺ KP + (*V*=*L*) + (Π<sub>1</sub>-Red) ⊢ φ.

In order to get rid of the axiom (*V*=*L*) on the left-hand side of the previous equivalence, we show that L is an inner model of KP + (Σ<sub>1</sub>-Sep).

**Theorem 5.11** *If* φ *is the universal closure of an axiom of* KP + (Σ<sub>1</sub>-Sep)*, we have that*

$$\mathsf{KP} + (\Sigma_1\mathsf{-Sep}) \vdash \varphi^L.$$

*Here* φ<sup>L</sup> *is the result of restricting all unbounded quantifiers in* φ *to* L*.*

*Proof* In view of Barwise (1975) we only have to deal with the instances of (Σ<sub>1</sub>-Sep). So let φ[x, y, z] be a Δ<sub>0</sub> formula with all free variables indicated; we suppress mentioning additional parameters. Given elements a, b ∈ L we have to show that

$$(\exists z \in L)(\forall x \in L)(x \in z \leftrightarrow x \in a \land \exists y (y \in L \land \varphi[x, y, b])).$$

By (Σ<sub>1</sub>-Sep) there exists the set

$$c = \{x \in a : \exists y (y \in L \land \varphi[x, y, b])\}$$

and thus we have

$$(\forall x \in c) \exists \xi (\exists y \in L\_{\xi}) \varphi [x, y, b].$$

By Σ collection there is an α such that

$$(\forall \mathbf{x} \in c)(\exists \xi < \alpha)(\exists \mathbf{y} \in L\_{\xi})\varphi[\mathbf{x}, \mathbf{y}, b]$$

and so, by the properties of the L-hierarchy,

$$(\forall x \in c)(\exists y \in L\_{\alpha})\varphi[x,y,b].$$

This implies that c = {x ∈ a : (∃y ∈ L<sub>α</sub>)φ[x, y, b]}. Hence c is the required witness in L. □

This theorem implies that KP + (Σ<sub>1</sub>-Sep) + (*V*=*L*) is conservative over KP + (Σ<sub>1</sub>-Sep) for formulas which are absolute w.r.t. KP + (Σ<sub>1</sub>-Sep), in particular for all arithmetical formulas.

**Corollary 5.12** *We have for all arithmetical formulas* φ *that*

$$\mathsf{KP} + (\Sigma_1\mathsf{-Sep}) \vdash \varphi \iff \mathsf{KP} + (V{=}L) + (\Pi_1\mathsf{-Red}) \vdash \varphi.$$

If we summarize our results, we have the following proof-theoretic equivalences:

$$\mathsf{KP} + (V{=}L) + (\Pi_1\mathsf{-Red}) \equiv \mathsf{KP} + (\Sigma_1\mathsf{-Sep}) \equiv \Pi^1_2\mathsf{-CA}_0 + (\mathsf{BI}).$$

#### **6 Comments and questions**

This article did not discuss the theory KP<sub>0</sub> + (*V*=*L*) + (Π<sub>1</sub>-Red). There is the question of how to deal with the constructible hierarchy in KP<sub>0</sub> and whether there is an analogue of Theorem 5.8 with KP replaced by KP<sub>0</sub>. However, we are not sure whether this leads to something interesting.

Our real concern in the present context is the question of the strength of KP + (Π<sub>1</sub>-Red). We know that (Π<sub>1</sub>-Red) is not provable in KP. This can be seen as follows:


But is the proof-theoretic strength of KP + (Π<sub>1</sub>-Red) greater than that of KP? As a preparatory step for the analysis of the proof-theoretic strength of KP + (Π<sub>1</sub>-Red) it could be useful to check whether KP + (Π<sub>1</sub>-Red) proves (Π<sup>1</sup><sub>1</sub>-CA) or (Δ<sup>1</sup><sub>2</sub>-CA).

A different line of research is to look at reduction principles of the forms discussed above in theories of sets and classes. Some first results are known, but in general this field is wide open.

#### **References**


Gregoriades, V. (2019). On a question of Jaeger's. arXiv: 1905.09609.


Open Access This chapter is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. If you remix, transform, or build upon this chapter or a part thereof, you must distribute your contributions under the same license as the original.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Comments on the Contributions**

Peter Schroeder-Heister

**Abstract** The contributions to this volume represent a broad range of aspects of proof-theoretic semantics. Some do so in the narrower, and some in the wider sense of the term. Some deal with issues I have been concerned with directly, and some tackle further problems. All of them open interesting new perspectives and develop the field in different directions. I will briefly comment on the significance of each contribution here.

**Sundholm on Frege's anticipation of the deduction theorem.** Two papers are historically oriented: Göran Sundholm's and to some extent that by Neil Tennant (which will be discussed later). Both deal with Frege. I welcome this very much, not only as Frege has been a main topic in my own historical studies, but also because Frege's work is of high systematic relevance to proof-theoretic semantics. He was the first to develop a precise notion of logical deduction, based on the idea that we proceed from judgements already seen as true to new true judgements in a gap-free way. Even the sentences of his concept script (*Begriffsschrift*) can be read as a two-dimensional notation of sequents, so that his proofs proceed in a sort of sequent calculus (Schroeder-Heister 1997; 2014). Sundholm draws attention to the much neglected logical content of Section 17 of Frege's *Foundations of Arithmetic* (1884). He shows that Frege in a sense anticipates the deduction theorem as well as its proof in Hilbert-Bernays (1934). In pointing this out, he discusses Frege's notion of analyticity in detail, as the conditional statements generated by the deduction theorem represent analytical judgements. What I find particularly interesting, besides the wealth of discussion of related concepts such as axiom, tautology, self-evidence, topic-neutrality and the like, is the fact that Sundholm, at least implicitly, alludes to the fact that in Frege there is a notion of proof from assumptions, in addition to proofs of hypothetical judgements, because otherwise talking of the deduction

Peter Schroeder-Heister

Department of Computer Science, University of Tübingen, Germany, e-mail: psh@uni-tuebingen.de

<sup>©</sup> The Author(s) 2024

T. Piecha and K. F. Wehmeier (eds.), *Peter Schroeder-Heister on Proof-Theoretic Semantics*, Outstanding Contributions to Logic 29, https://doi.org/10.1007/978-3-031-50981-0\_17

theorem does not make much sense.<sup>1</sup> Though there are several places where Frege speaks against assumptions as a special mode of judgement (e.g., Frege 1906; 1923), in the passage discussed by Sundholm he speaks of a deduction which starts from a *fact* ("Tatsache"), that is, something that comes from outside, and argues that the content of this deduction can be expressed by an (analytical) hypothetical judgement. The judgement we would call an assumption (as the starting point of the deduction) figures in Frege as the statement of an external fact. Formally, for the validity of the subsequent chain of deduction, it does not make any difference whether we start from an external fact or from an internally specified assumption: we have the idea of something from which a logical deduction starts, and which can then become an ingredient of the deduction theorem.<sup>2</sup>

**Tennant on Frege's class theory and the logic of sets.** Tennant argues that Frege's fundamental assumptions would have led to a system corresponding to Core Logic, the foundational system for logic and mathematics Tennant has advocated and developed for several decades. Core Logic is itself very interesting from the standpoint of proof-theoretic semantics, as it relies on specific inferentialist assumptions concerning harmony, normal form etc. What makes it particularly significant is that it includes and accounts for crucial issues beyond standard proof-theoretic semantics, such as the notion of relevance as well as the non-insistence on transitivity (or cut in the sequent calculus). The latter is a point I have also put forward in several contexts, in particular in the development of definitional reflection and its application to paradoxes, a field where Tennant and myself have similar intuitions (Tennant 1982; Schroeder-Heister 2012b)<sup>3</sup>. What is also important about Core Logic is that it is a free logic, a topic underdeveloped in standard proof-theoretic semantics (or only discussed in connection with the denotation of proofs, but not in connection with singular terms), as the whole area of denotation is not given the attention it deserves. If we base our theorizing not on denotation and truth but on inference and proof as basic concepts, we must be able to say how we can incorporate the denotational concepts. Core Logic is perhaps the best worked-out system in this direction, apart from constructive type theories in the Martin-Löf tradition. Though itself set-theoretic in spirit, it has in common with the latter that it introduces the terms and predicates needed through particular rules. In standard approaches to set theory, the existence of certain sets is postulated axiomatically, but the terms denoting these sets are not primitive symbols of the language itself but have a status similar to eliminable definite descriptions.
Tennant calls them "pasigraphs" as they are used throughout, even in

<sup>1</sup> When reading Frege-implications as sequents, an iterated implication with the antecedents ("Unterglieder") A and B and the succedent ("Oberglied") C cannot be distinguished from one with the antecedent A and the succedent B*-implies-*C; it is just a different way of looking at the sequent, which means that a deduction theorem in this sense is built into Frege's notation (see Schroeder-Heister 1999; 2014).

<sup>2</sup> Frege obviously means *empirical facts* as he is making his remarks in the context of the discussion of induction, though his logical claim is completely independent of this context. His deduction theorem also holds when a deduction starts, for example, from a tautology or a contradiction.

<sup>3</sup> Even though our opinions diverge concerning Ekman's paradox; see Schroeder-Heister and Tranchini (2017; 2021) and Tennant (2021).

the standard expositions, and insists that they be inferentially defned by introduction and elimination inferences.

**Prawitz on validity of inference and argument.** Proof-theoretic validity lies at the core of proof-theoretic semantics. Whether this programme will be successful depends in many respects on whether it can provide notions of validity that can compete with what is available in model-theoretic semantics. Defining a promising concept has been one of the central philosophical occupations of Prawitz's research after his groundbreaking work on natural deduction (Prawitz, 1965). The programme of defining validity has turned out more difficult than perhaps envisaged at the beginning of the 1970s. For a long time Prawitz has advocated a notion of validity of arguments or proofs, on which, as a secondary notion, a notion of validity of inferences could build (Prawitz 1971; 1973; 2006). I have myself tried to formally explicate this programme (Schroeder-Heister, 2006), though I have always tended to give the validity of inferences a conceptually primary status. For somewhat different reasons Prawitz now also favours such an approach. The big problem, however, is that the notions of validity of inference and validity of proof interact. If one wants to put validity of inferences first and that of proofs second (the validity of proofs as depending on that of the inferences involved in the proofs), there is the problem that there needs to be some property whose transmission from premisses to conclusion of an inference establishes its validity. This property would be some sort of validity of proofs (in the model-theoretic case it is the notion of truth in a model). Prawitz and myself are both working on hopefully resolving this prima facie circularity. His contribution to this volume can be seen as an intermediary step towards a solution. It characterizes the notions of inference, validity, argument, canonicality etc. together with their interdependencies, without yet trying to bring these concepts into a well-founded order.
This is a step forward, as it sets up certain adequacy conditions these concepts have to obey. Currently it is not clear that in the end a well-founded order of concepts can be established. Perhaps the structural characterization of concepts and their interrelations is the best we can achieve: Proof-theoretic validity of inferences and arguments would then constitute a structure that can materialize in different ways. As this is work still under way, I leave the discussion here at this general level, and just stress its fundamental character.

**De Campos Sanz on Kolmogorov and the theory of problems.** Starting from Kolmogorov's interpretation of logical propositions as problems for which a solution is sought, de Campos Sanz arrives at a couple of distinctions that are directly relevant to proof-theoretic semantics. He presents a "reduction semantics", according to which the consequence from A to B is interpreted as the reduction of the problem B to the problem A, and proves its adequacy with respect to intuitionistic logic. Applying this apparatus to problem solving through constructions in elementary geometry leads him to consider additional logical constants, an example of which is what he calls "before-after conjunction", which is intended to capture the order in which constructions are carried out. According to de Campos Sanz it cannot be defined in terms of standard logical constants but should nevertheless be conceived as logical. He is even led to consider, besides the notion of *assumption* in the sense that a solution to

a problem is supposed to be available, a more general notion of *hypothesis* in *reductio proofs* (which are abundant in geometry). From my point of view, Kolmogorov's *problem interpretation* of logic should be considered in connection with reductive approaches such as tableau systems and dialogical interpretations — in this sense de Campos Sanz's usage of the term "reduction semantics" is highly accurate. At a more general level, investigations such as de Campos Sanz's show us that other forms of argumentation together with other sorts of concepts come into play when we leave the narrow path of standard logical systems. This opens perspectives which are still way beyond the grasp of proof-theoretic semantics as it stands now.

**Pereira, Haeusler and Nascimento on disjunctive syllogism and** *ex falso***.** The paper by Pereira, Haeusler and Nascimento presents two systems between minimal and intuitionistic logic in which *ex falso quodlibet* is not a consequence of disjunctive syllogism. This is a significant contribution to a discussion that has occupied logic since antiquity, as disjunctive syllogism is considered a *prima facie* plausible rule, whereas the *ex falso* principle does not have this degree of plausibility. The first system is a variant of Tennant's intuitionistic relevance logic, but without the relevance restriction concerning assumptions; the second is an adaptation of an intuitionistic multiple-conclusion system (developed by the authors in earlier work) to a single-conclusion system, where the visibility of assumptions is restricted in a certain way. This is immediately relevant to proof-theoretic semantics, as it gives us novel options for framing deductive reasoning which go beyond standard natural deduction (as the formal paradigm of proof-theoretic semantics). The first option, which the paper shares with Tennant, gives the absurdity constant a kind of structural role when it occurs in generalized elimination inferences. The second option and its associated notion of "visibility" gives an intuitively more perspicuous rendering of what otherwise would be achieved by a not-so-perspicuous multiple-conclusion system. *Visibility* might perhaps be considered an additional basic concept in the definition of a proof structure, which can certainly be extended to the general case of arbitrary *n*-ary connectives and even to general reasoning with inductively defined objects. The paper also shows that intuitionistic and classical logic are not the only logical systems to be discussed, and that an intermediate system may be more faithful to our semantical intuitions. This corresponds to observations also made in Piecha and Schroeder-Heister (2019) that intuitionistic logic is not the logic of standard proof-theoretic validity, and that a logic weaker than classical but stronger than intuitionistic logic might be taken into consideration.
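The derivation that such systems are designed to block is the classical one, often attributed to C. I. Lewis, in which *ex falso* is obtained from disjunctive syllogism together with ∨-introduction; a sketch in rule notation:

```latex
% From the premisses A and \neg A, any B follows once
% \vee-introduction and disjunctive syllogism (DS) are
% both available:
\[
  \frac{A}{A \vee B}\;{\vee}\mathrm{I}
  \qquad\qquad
  \frac{A \vee B \qquad \neg A}{B}\;\mathrm{DS}
\]
```

Systems between minimal and intuitionistic logic in which *ex falso* fails must therefore restrict the interplay of these two rules in some way.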

**Indrzejczak on the logicality of equality.** In addition to being an encompassing overview of many treatments of equality in natural deduction and in the sequent calculus, this paper shows that equality can indeed be considered a logical constant in the sense of Došen's (1989) proposal using double-line rules. However, Indrzejczak does not take over Došen's own proposal concerning equality rules, which for Indrzejczak is of a 'global' nature, since rephrasing its rules in natural deduction would require rewriting a whole proof. He prefers a 'local' variant with 'free' [my terminology] predicate constants obeying the side condition that they do not occur otherwise in the sequent. I would consider this a hidden second-order treatment of equality which effectively expresses Leibniz equality. This shows the strength of Leibniz's concept if one wants to obtain a 'local' notion. It also points to the fact that the relationship between first- and second-order concepts is a critical topic of proof-theoretic semantics<sup>4</sup>, which is far from being settled and which requires further research. At the same time Indrzejczak points to the unclear relationship between logicality and semantics. Even though many (such as Došen and also myself in several publications, e.g., Schroeder-Heister, 1984b) have tried to keep these two issues apart, the criteria for logicality, when formulated in terms of inference rules, are often such that they can be read as meaning-conferring and thus as semantical conditions.
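Leibniz equality, referred to here, is the familiar second-order rendering of identity; as a sketch (the implication can be strengthened to a biconditional, since the converse direction follows by a suitable instantiation of *X*):

```latex
% Leibniz equality: a and b are equal when every property
% of a is also a property of b.
\[
  a = b \;:\Longleftrightarrow\; \forall X \bigl( X(a) \rightarrow X(b) \bigr)
\]
```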

**Arndt on rules for implication elimination.** Arndt discusses different ways of formulating the implication elimination rule in natural deduction. In an earlier publication (Arndt, 2019), he distinguished eight possible variants of implication rules in the sequent calculus. They differed in the way they could be derived, by means of the cut rule, from an axiomatic sequent expressing *modus ponens*. These variants included the standard implication-left rule as well as my proposal for a revised implication-left rule (Schroeder-Heister, 2011). Here this is carried over to natural deduction, where the role of cut is now taken by a rule of explicit composition (which is related to explicit substitution, see Abadi, Cardelli, Curien, and Lévy, 1991; Arndt and Tesconi, 2014). Furthermore, a notational device is added that forces certain premisses of rules to be assumptions (to "stand proud" in Tennant's (1992; 2002) terminology). The result is a congruence between natural deduction and the sequent calculus which puts them in much closer parallel than the usual translations between these types of systems. At the same time it gives a proper understanding of bidirectionality in natural deduction, which I had proposed (at least programmatically) in Schroeder-Heister (2009). Although these are observations at the syntactical level of formal systems, they are highly relevant to proof-theoretic semantics, as the way reasoning is framed depends on which options we have for understanding rules, assumptions, proof composition and the like. This is the first investigation I am aware of that takes subtle differences in the formulations of inference rules into account by making explicit the possible differences in the status of their premisses.
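As an illustration of the kind of rule variants at issue (not Arndt's own notation), here are the standard elimination rule (*modus ponens*) and the general elimination rule for implication, in which the conclusion *C* is obtained from a derivation of *C* from the assumption *B*:

```latex
% Standard implication elimination versus general elimination:
\[
  \frac{A \rightarrow B \qquad A}{B}
  \qquad\qquad
  \frac{A \rightarrow B \qquad A \qquad
        \begin{matrix}[B]\\ \vdots\\ C\end{matrix}}{C}
\]
```

In the general form the major premiss *A → B* can be required to "stand proud", i.e. to be an assumption rather than the conclusion of a derivation.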

**Liang and Miller on focusing Gentzen's LK.** The contribution of Liang and Miller gives, from the perspective of proof search, a presentation of the authors' "focused" sequent calculus, which overcomes weaknesses of Gentzen's classical sequent calculus LK. It is well known that the reduction procedures for LK, as applied, for example, in cut elimination proofs, are not very deterministic, due to the fact that a great number of permutations between inferences are possible. This makes the sequent calculus differ from the calculus of natural deduction, which, however, has other deficiencies when it comes to proof search. In order to keep the advantages of the sequent calculus for computational purposes, it is here expanded to a system which is syntactically more involved, by considering atoms and connectives of different polarities and, correspondingly, two different sequent arrows. One is compensated for this by a more streamlined and more deterministic structure of proofs, which is not only useful for computational purposes but also for demonstrations of metalogical results such as Herbrand's theorem. It generalizes systems of a related kind developed by Girard, Andreoli and others. This paper shows that the area of proof search, which at the time of the origin of proof-theoretic semantics was much neglected, is becoming an integral part of it, in the sense that proof-search aspects are built into the semantically understood inferences.

<sup>4</sup> See also, in another context, the contribution by Pistone and Tranchini (2023, this volume).

**Pistone and Tranchini on intensional harmony as isomorphism.** The concept of harmony is central in the proof-theoretic semantics of natural deduction. That the consequences of the elimination rules of a logical sign have the same deductive power as the conditions of its introduction rules is often seen as a justification of these rules. Harmony also gives rise to identities between proofs in the sense that certain successions of introductions and eliminations can be "reduced" or "contracted", yielding a proof which is still considered the same as the original one. In this sense, harmony is the basis of an intensional proof-theoretic semantics built on the notion of identity of proofs. Because a general definition of harmony was a desideratum, in Schroeder-Heister (2015) I developed an approach which translated the conditions of introduction rules as well as the consequences of elimination rules into formulas of second-order logic with propositional quantification, and defined harmony as logical equivalence of these translations. Pistone and Tranchini point out that equivalence is too weak a notion for an appropriate intensional notion of harmony and that some sort of isomorphism of these translations is needed. This is not available in the metatheory of second-order logic based on beta and eta reduction. As a solution, the authors propose an additional so-called "epsilon reduction" which is based on the idea that there is exactly one proof of polymorphic identity in second-order logic. This is a major step forward beyond what I had proposed, as for many cases it leads to a plausible notion of harmony. It also demonstrates the significance of second-order propositional logic for the proof-theoretic semantics of elementary propositional logic.
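The second-order translations in question are of the familiar Russell–Prawitz kind; for example, conjunction and disjunction are rendered in second-order propositional logic as:

```latex
% Russell-Prawitz translations into second-order
% propositional logic (propositional quantification):
\[
  A \wedge B \;\mapsto\; \forall X\,\bigl((A \rightarrow B \rightarrow X) \rightarrow X\bigr)
\]
\[
  A \vee B \;\mapsto\; \forall X\,\bigl((A \rightarrow X) \rightarrow (B \rightarrow X) \rightarrow X\bigr)
\]
```

Harmony as equivalence asks whether the translations of the introduction conditions and of the elimination consequences are interderivable; harmony as isomorphism asks in addition that the two interderivability proofs compose to identity proofs.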

**Wansing on synonymy.** The intensional notion of proof identity induces a notion of isomorphism between propositions. One would consider *A* and *B* isomorphic to one another if there are proofs of *B* from *A* and of *A* from *B*, such that the composition of these proofs yields the identity proof of *A* from *A* and of *B* from *B*. "Yields" here means that the given composition of proofs is identical (in the sense of proof identity) to the identity proof. Wansing, who is working in a bilateral framework of proofs and disproofs, gives a different definition of synonymy (his term for isomorphism). For him *A* and *B* are synonymous if there are identical proofs from *A* to *B* and from *B* to *A*, as well as identical disproofs between these propositions. His notion of proof identity does not require that the propositions proved (and assumed) are the same between identical proofs, which goes against a principal *tenet* of standard intensional proof-theoretic semantics. Identity in Wansing's sense is defined by a structural correspondence of sequent-style proofs. This definitely gives novel incentives to the discussion of the identity of proofs, both from the structural point of view (identical proofs for different propositions) and from the consideration of the sequent calculus and the appeal to bilateralism, which are very much neglected in current discussions of proof identity. So far, only the discussion of the principal type of a lambda term as representing the structure of a proof comes close to the idea that there can be identical proofs (proofs of identical structure) of different theorems (see Hindley, 1997; Rezende de Castro Alves, 2019). In the discussion of identity of proofs, and thus in intensional proof-theoretic semantics (see Tranchini, 2023), there are many conceptual aspects still open, and Wansing provides a fresh look at some of these.
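The isomorphism condition just described can be displayed compactly; writing *f*, *g* for the two proofs, ∘ for composition and ≡ for proof identity, it is the usual categorical formulation:

```latex
% A and B are isomorphic when there are proofs
%   f : A -> B  and  g : B -> A
% whose compositions are identical to the identity proofs:
\[
  g \circ f \;\equiv\; \mathrm{id}_A
  \qquad\text{and}\qquad
  f \circ g \;\equiv\; \mathrm{id}_B .
\]
```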

**Kahle and Santos on paradoxes.** Kahle and Santos discuss the relationship between the conceptual constructions of logical, semantical and set-theoretic paradoxes and the logic used to derive a contradiction. It is the logic which renders these constructions paradoxical. On the other hand, it is extremely difficult to make specific logical features responsible for the paradoxical outcome — just passing to another logical system, for example from classical logic to a logic without excluded middle such as intuitionistic logic, does not change the situation. What changes it are global considerations such as normalization requirements in the sense of Prawitz (1965, Appendix B) and Tennant (1982). Therefore Kahle and Santos plead for further scrutiny of the conceptual constructions of the paradoxes, but from a consequentialist point of view, for which I have argued myself (based on ideas of Lars Hallnäs): not restricting conceptual definitions themselves, but classifying definitions according to their possible consequences, including the non-eliminability of cut etc. (Schroeder-Heister, 2012b). However, while Kahle and Santos discard the reference to substructural logics to avoid the paradoxes and criticize some attempts I made in this direction (Schroeder-Heister 2012a; 2016), I see an option that limits the rule of contraction. This is a limitation not in the global substructural sense, but in a local intensional sense (see Schroeder-Heister, 2022).
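The role of contraction can be made visible in the standard derivation of absurdity from a Russell-style biconditional ρ ↔ ¬ρ; the following sketch (not the authors' formalism) marks the step where the assumption ρ is used twice:

```latex
% Derivation of absurdity from rho <-> not-rho:
\begin{align*}
  &(1)\quad \rho \rightarrow (\rho \rightarrow \bot)
      && \text{from } \rho \rightarrow \neg\rho\\
  &(2)\quad \rho \rightarrow \bot
      && \text{from (1) by contracting the two uses of } \rho\\
  &(3)\quad \rho
      && \text{from (2), i.e.\ } \neg\rho\text{, and } \neg\rho \rightarrow \rho\\
  &(4)\quad \bot
      && \text{from (2) and (3)}
\end{align*}
```

Restricting contraction at step (2), whether globally or only locally for certain defined expressions, blocks the derivation.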

**Hallnäs on the structure of proofs.** As indicated in my autobiographical survey (Schroeder-Heister, 2023, Section 7), I owe to Lars Hallnäs many ideas I consider relevant for current proof-theoretic semantics, and even more relevant for its future development. His idea of definitional reflection, that is, the idea of a general principle for extracting information from definitions, which can be partial and are not necessarily monotone, goes way beyond logic and has shaped my understanding of proof-theoretic semantics: not only because this approach represents a powerful extension of logic programming, as we presented it initially (Hallnäs and Schroeder-Heister, 1990), but, more importantly, because it constitutes a general reasoning principle from an intensional point of view. Hallnäs's contribution to this volume sketches the direction in which this might lead when we consider not only the function closure but the functional closure of definitions. Already the structure of natural deduction, with its concept of assumption discharge and corresponding side conditions, makes such an approach reasonable. The most original idea in this paper is the characterization of proofs in terms of their reductive behaviour, which allows Hallnäs to compare and identify proofs in different formal systems, as this behaviour is independent of the formal system itself. I would look at it as an attempt to formalize the concept of a 'proof idea', something that every mathematician is aware of and that is the driving force in defining the identity of proofs, but whose formal treatment has not really progressed so far. Hallnäs calls it a "structure theory of proofs", which is much more abstract than "structural proof theory" (Negri and von Plato, 2001) (and which partly overlaps with what Prawitz (1971; 1972; 1973) called "general proof theory"). Using his general operators on proofs, he discusses in particular what I have called "Ekman's paradox", a topic of his former doctoral student Jan Ekman to which Hallnäs drew my attention in the early 1990s and which has fascinated me ever since, leading to recent work with Tranchini (Schroeder-Heister and Tranchini 2017; 2021). He can formally demonstrate that both Ekman's normalization paradox and Russell's set-theoretic paradox, though formulated in different formal systems, are based on the same idea, as they satisfy the same abstract proof equation. Hallnäs's analysis actually gives a structural rendering of Ekman's reduction of proof terms, which from the semantical perspective applied by Tranchini and myself remains invisible. The application of such general tools is a promising method in advanced intensional proof-theoretic semantics. This holds likewise for the advanced second-order tools used by Pistone and Tranchini (2023, this volume).

**Francez and Kaminski on truth-value constants in multi-valued logics.** The contribution by Francez and Kaminski can be seen as an application of proof-theoretic semantics to a system where formulas are signed with truth values. It is thus a generalization of bilateral systems, where one uses positively and negatively signed formulas, to the case of finitely many truth values. The elimination rule of the system, formulated in sequent-style natural deduction, corresponds to the general elimination rule proposed by Prawitz (1979) and myself (Schroeder-Heister, 1984a), but is now derived from introduction rules based on the truth-functional meaning of the connective considered. The paper discusses in particular the case of the nullary constants truth and falsity and their generalizations to arbitrary truth values, and establishes that we have explosion rules for them, corresponding to *ex falso quodlibet*, in the case when the nullary constant is signed with a non-matching truth value. This shows that proof-theoretic semantics can be productively applied in the area of multi-valued logic and is not confined to intuitionistically inspired logics.

**Więckowski on counterfactual assumptions and implications.** Więckowski applies proof-theoretic semantics to causal and counterfactual reasoning, more precisely to reasoning from assumptions where assumptions are either *factual* ("since *A* is the case, *B* is the case", i.e., "*B* is the case because *A* is the case") or *counterfactual* ("if *A* were the case, *B* would be the case"). He overcomes my general characterization of assumptions in natural deduction as "unspecific", which I used to distinguish assumptions in natural deduction from those in bidirectional sequent-based reasoning (Schroeder-Heister, 2004). His idea is to use two proof systems: a "reference system" which is used to infer the assumptions of the "modal system". When the reference system derives the assumption in a canonical way, the modal consequence is a factual or causal inference, while if this is not the case, the modal consequence is counterfactual. Thus the reference system allows one to distinguish between a derivation from an accepted assumption, from a non-accepted assumption, and from an unspecific assumption just laid down. As his reference system he chooses subatomic natural deduction as proposed by himself (Więckowski, 2011), which is particularly suited to deal with identity assumptions in the modal system. This approach is a further step (there are not so many yet<sup>5</sup>) to make proof-theoretic semantics fruitful for the investigation of intensional natural-language phenomena.

**Bärtschi and Jäger on set-theoretic reduction principles.** The contribution by Bärtschi and Jäger sits on the border between reductive and general proof theory. It investigates the strength of so-called separation principles in second-order arithmetic, which allow one to distinguish two disjoint unary formulas by means of a set containing the instances of the first but no instance of the second. These principles play an important role in reverse mathematics. Under the name "reduction principles" (to distinguish them from the set-theoretic "separation axioms") they are investigated here with respect to set-theoretic laws, in particular in Kripke-Platek set theory as compared to systems with transfinite recursion. This demonstrates how much remains to be done in proof-theoretic semantics to achieve significant results of mathematical proof theory, given that Kripke-Platek set theory is related to theories of inductive definitions. For me inductive definitions are a key topic in a proof-theoretic semantics with definitional reflection, in particular when functional closure as in Hallnäs's (2023) contribution (this volume) is taken into account.

To conclude, when I look at the breadth and depth of the content of these essays, I feel confirmed in my assessment that proof-theoretic semantics has a bright future.

**Acknowledgements** I am very grateful to the authors for their contributions, and to Thomas Piecha and Kai Wehmeier for compiling and editing this volume.

#### **References**


<sup>5</sup> One is the second half (Part II) of Francez's (2015) book on proof-theoretic semantics, which deals with applications in linguistics; another is this author's later work, such as Francez (2022).


Open Access This chapter is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. If you remix, transform, or build upon this chapter or a part thereof, you must distribute your contributions under the same license as the original.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Index**

#### 2Int, 357

Abadi, Martin, 447 Aczel, Peter, 9 admissible sets, 434 Adorno, Theodor W., 4 Ajjanagadde, Venkat, 21 Almukdad, Ahmad, 344 analytic, 62 Andreoli, Jean-Marc, 276, 310, 448 application of inference, 140 Aristotle, 13, 56, 142, 143 arithmetical, 427 arithmetical comprehension, 427 arithmetical transfinite recursion, 428, 429 Arndt, Michael, 14, 28, 29, 447 assertion, 136 assertive reasoning, 269 assumption, 136, 400 assumptive reasoning, 269 atomic systems, 27 Avron, Arnon, 28, 215 axiom of countability, 432

Béziau, Jean-Yves, 24–25 Baaz, Mathias, 222 bar induction, 428 basic set theory, 430 before-after conjunction, 187, 445 Behmann, Heinrich, 371 Bell, John, 220 Benedict XVI, 4 Bernays, Paul, 3, 366, 443 BHK interpretation, 343 BHK-interpretation, 144 BHK-proof, 145, 146 bidirectional natural deduction, 268–271 bilateralism, 24, 25, 343, 448 Binder, David, 11, 229 binding, 136 Boghossian, Paul, 136 Bolzano, Bernard, 5, 141 Boole, George, 13 Bourdeau, Michel, 29 Brandom, Robert, 2 Brouwer, L. E. J., 197, 367, 368 Brouwer, Luitzen Egbertus Jan, 152 Bubner, Rüdiger, 20 Buchholz, Wilfried, 3 Buhl, Günter, 5 Buldt, Bernd, 8 Bärtschi, Michael, 451

calculus of squares, 24 canonical identity statement, 86, 93 Carnap, Rudolf, 8, 317, 329, 341 characteristic aim, 136 Chateaubriand, Oswaldo, 31


co-implication, 357 cognitive science, 20 composition, 148 comprehension principles, 426 conditional, 56 congruence relation, 353 constructive type theory, 9 contradiction, 393, 395 Contu, Patrizio, 22, 26 Coquand, Thierry, 14 Core Logic, 86, 88, 93, 94, 132, 444 counterfactual assumption, 409, 450 counterfactual implication, 408, 450 counterfactuals, 400, 450 counterpossibles, 417 Cozzo, Cesare, 137, 156, 159 Curry, Haskell, 365 Curry-Howard correspondence, 320 cut-elimination theorem, 288

de Campos Sanz, Wagner, 27, 344, 445 deduction theorem, 60, 443 definitional freedom, 370, 371 definitional reasoning, 16 definitional reflection, 16–17, 23–24 definitions, 376 Degtyarev, Anatoli, 219 dependency, 138 Devlin, Keith, 3 di Cosmo, Roberto, 341 Dialectica interpretation, 3 dialogical logic, 28 Diller, Justus, 3 Dingler, Hugo, 6 direct negation, 24 Dirichlet, Gustav Lejeune, 4 disjunctive syllogism, 193 disproof, 344 DLMPST, 31 Došen, Kosta, 17–19, 26–28, 30, 31, 211, 228, 318, 321, 341, 446, 447 double-barreled abstraction, 95

double-line rules, 17, 446

Duži, Marie, 30 Dummett, Michael, 5, 13, 17, 25, 29, 58, 145, 151–153, 316 Dunn, Michael, 18, 19 Dyckhoff, Roy, 18, 26 Däubler-Gmelin, Herta, 30

Egli, Urs, 7 Ehrenstein, Walter, 12 Ekman's paradox, 383 Ekman, Jan, 23, 450 encyclopedia, 6 equality, 212, 446 Eriksson, Lars-Henrik, 24 Euler, Leonhard, 6 evidence, 142 ex falso quodlibet, 193, 446, 450 explicit composition, 242, 246, 248, 253, 257, 258, 260, 263–265, 268, 270, 271 explosion, 392–395 exposed normal form, 255 expressive completeness, 9, 10 extensionality, 88, 101, 102, 107, 111 extensions of logic programming, 16–17

Fackeldey, Hubert, 4 factual assumption, 409, 450 factual implication, 408, 450 falsum (⊥), 391, 394, 395 Feferman, Solomon, 3 Felgner, Ulrich, 3 Felscher, Walter, 3 Feyerabend, Paul, 15 Fichot, Jean, 29 final conclusion, 140 focused proof system, 276 focusing, 447 Francez, Nissim, 1, 32, 450 free logic, 87, 91, 92, 100, 102, 104, 110, 112, 113, 116, 121, 128 Frege conditional, 54 Frege, Gottlob, 5, 8, 12–13, 54, 85–91, 96–100, 102, 103, 105, 106, 111, 136, 139, 443–445

Friedrichsdorf, Ulf, 8 functional closure, 376, 377 functional completeness, 9 Gabbay, Dov, 14, 19, 22 Gallier, Jean, 220 Gazzari, René, 29, 234 general elimination rule, 239, 242, 247, 248, 253, 254, 256–258, 261, 263, 265, 268 general logical laws, 64 general proof theory, 450, 451 generalized elimination rules, 9 generalized η-expansions, 321 generic, 137–139 Gentzen semantics, 1, 9 Gentzen, Gerhard, 1, 30, 59, 136, 140, 151, 152, 154, 276, 316, 447 Girard, Jean-Yves, 19, 27, 279, 309, 344, 448 Gratzl, Norbert, 229 Grelling, Kurt, 11, 363 Griffiths, Owen, 219 Grishin, Viacheslav, 19 ground, 142 Guenthner, Franz, 18–20 Gödel, Kurt, 2, 3, 10, 370 Hacking, Ian, 211, 224 Haeusler, Edward Hermann, 446 Hallnäs, Lars, 14, 16–17, 23, 368, 370, 449, 451 harmony, 26–27, 319, 395, 448 Hasenjaeger, Gisbert, 2–3, 9, 10 Heidegger, Martin, 5 Heinzmann, Gerhard, 31 Heister, Gabriele, 6, 12, 20 Henkin, Leon, 371 Herbrand's theorem, 307, 448 Herbrand, Jacques, 60 hereditarily countable, 432 Herre, Heinrich, 30 Hertz, Paul, 13, 30 Heyting, Arend, 144, 145 Heyting-Brouwer logic, 357 higher-level rules, 9

Hilbert's tenth problem, 2 Hilbert, David, 363, 364, 368, 370, 371, 443 Hindley, J. Roger, 449 Hintikka, Jaakko, 212 Hoering, Walter, 21, 30 Horkheimer, Max, 4 Hudelmaier, Jörg, 18, 21 Huet, Gérard, 14 Husserl, Edmund, 8 Hyland, Martin, 14 hypersequent, 17 hypo semantics, 168 hypothetical reasoning, 25 identity, 212 identity of proofs, 318, 328, 448 implication, 56 implication elimination, 447 implications-as-links, 245, 248 implications-as-rules, 243, 245 impredicative encodings (Russell-Prawitz translation), 318, 334 incompleteness, 27 inconsistency, 393 Indrzejczak, Andrzej, 234, 446 inference, 58 inference rule, 322 inherited identity, 351 intensional proof-theoretic semantics, 23–24, 448, 449 intuitionistic, 144, 155 inversion principle, 154 isomorphism, 318, 320, 327, 333, 448 Jaśkowski, Stanisław, 136, 234 Jensen, Ronald Björn, 3 Jäger, Gerhard, 30, 451 Jüngel, Eberhard, 4 Kahle, Reinhard, 22, 28–30, 212, 449 Kalish, Donald, 217 Kaminski, Michael, 450 Kanger, Stig, 219 Kant, Immanuel, 4

Kasper, Walter, 4 Keronen, Seppo, 21 Keuth, Herbert, 21 Kleene, Stephen C., 2, 365 Klev, Ansten, 219 Kolmogorov, Andrei N., 11, 161, 163, 165, 169, 170, 172, 176, 445 Koppelberg, Sabine, 3 Koslow, Arnold, 229 Kripke, Saul, 24 Kripke-Platek set theory, 430, 434, 451 Kuhn, Thomas S., 12 Käsemann, Ernst, 4 Küng, Hans, 4 Kürbis, Nils, 229

López-Escobar, Edgar, 26, 344 Lambek calculus, 19, 21 Lambek, Joachim, 17, 19 law of excluded middle, 100 laws of logic, 74 Leibniz Axiom, 215 Leibniz equality, 447 Leibniz, Gottfried Wilhelm, 13 Leitsch, Alex, 222 Leiß, Hans, 3, 7 Liang, Chuck, 447 linear logic, 276 LKF, focused version of LK, 276, 282 Lloyd, John W., 14 located constants, 394 located formula, 392 located sequent, 392 logic and cognition, 20–21 logic in philosophy, 22 logic programming, 14 logical framework, 14 logical ground inference, 261, 263, 264 logical ground sequent, 240, 249, 264 Logik der Forschung, 4 Loos, Rüdiger, 21 Lorenzen, Paul, 5 Lotze, Hermann, 13

Löwe, Benedikt, 28 Lüdecke, Rainer, 29 Maas, Wolfgang, 3 Machover, Moshe, 220 Manzano, Mara, 212 Martin-Löf type theory, 14, 15 Martin-Löf, Per, 9, 10, 12, 14, 29, 30, 142, 145, 146, 153, 218, 320, 370, 444 Maruyama, Yoshihiro, 229 Materna, Pavel, 30 Mates, Benson, 215 mathematical induction, 141 Matiyasevich, Yuri, 2 Maurer, Harald, 29 McClelland, James, 20 meaning, 141, 143 Merrill, Daniel, 13 Meyer, Paul Georg, 4 Meyer, Robert K., 19 Miller, Dale, 14, 17, 447 Miller, David, 11 minimal logic, 193 Minsky, Marvin, 15 Mints, Grigori, 26, 29, 31, 219 Mittelstraß, Jürgen, 6–8, 13, 15 modal proof system, 408 mode of assumption, 409 model-theoretic semantics, 1 Moltmann, Jürgen, 4 Montague grammar, 7 Montague, Richard, 217 Moreno, Manuel, 212 mountweazel, 7 multi-valued logics, 391, 395, 450 multifocused proof system, 310 Müller, Gert, 3

N4, 345, 346 Nagashima, Terui, 222 naive comprehension, 100 Nascimento, Victor, 446 natural deduction, 59, 151, 154, 158, 194, 393 negation by failure, 24


Negri, Sara, 220, 450 Nelson, David, 344 Nelson, Leonard, 363 Newell, Allen, 15 Nihilartikel, 7 non-creative, 157 non-primitive identity, 402 Oestermeier, Uwe, 21 Oliveira, Hermogenes, 29 Olkhovikov, Grigory, 26 Ono, Hiroakira, 19 Orlandelli, Eugenio, 229 paradox, 23–24, 449 Curry's, 365–367 Ekman's, 383, 444, 450 Girard's, 370 Grelling–Nelson, 364, 367 Kleene–Rosser, 365 Liar, 365–368, 370 of inference, 58 Richard, 365 Russell's, 87, 88, 96, 98–100, 102, 111, 364–368, 382, 450 parametricity, 329 Parlamento, Franco, 215 pasigraph, 86, 112–122, 124–127, 130, 131, 133, 444 Paulson, Lawrence C., 14 Pearce, David, 19, 30 Peppinghaus, Benedikt, 3 Pereira, Luiz Carlos, 30, 446 perseity, 64 Petrić, Zoran, 341 Pfeifer, Helmut, 3 Π 1 reduction, 429 Piecha, Thomas, 10, 11, 27–30, 229, 344, 446 Pistone, Paolo, 27, 448, 450 Plotkin, Gordon, 14 Poggiolesi, Francesca, 214, 342 Pohlers, Wolfram, 3 polarization, 279 Popper, Karl R., 4, 7, 8, 11–13, 21, 228

practical proposition, 175 Prauss, Gerold, 5 Prawitz's completeness conjecture, 10 Prawitz, Dag, 1, 9–10, 15, 17, 25, 27, 29, 30, 32, 89, 206, 316, 319, 325, 340, 445, 449, 450 Prestel, Alexander, 2, 8, 9, 12, 15 Previale, Flavio, 215 probability, 15 problem, 175 problem interpretation, 161 PROLOG, 14, 16 proof equations, 381 proof search, 448 proof structure, 378 proof unfolders, 380 proof-theoretic semantics, 416 Proops, Ian, 59 proudness marker, 242, 252, 253, 255, 263, 264, 271 pure sets, 92, 96, 105, 111 Pym, David, 10, 22 quantum disjunction, 322, 334 Quine, Willard Van Orman, 216

Ratzinger, Joseph, 4 Read, Stephen, 18, 218, 369 reduction principles, 426 reduction procedure, 93, 94 Reduction Semantics, 166 reduction semantics, 168 Reeves, Steven, 222 reference proof system, 409 reflexivity of identity, 91, 99, 109 refutation, 344 Rejewski, Marian, 2 Restall, Greg, 234 reverse mathematics, 426 Reyle, Uwe, 14 Rezende de Castro Alves, Tiago, 29, 449 Robinson, Edmund, 22 Rosser, John Barkley, 365 rule of atomic denotation, 91, 93, 95, 97, 99, 105, 109, 113, 121

rule of functional denotation, 113, 116, 120 rules for set theory, 132 Rumelhart, David, 15, 20 Russell's paradox, 87, 88, 96, 98–100, 102, 111, 364–368, 382, 450 Russell, Bertrand, 364, 371 Sambin, Giovanni, 19, 28, 229 Sandqvist, Tor, 10, 27 Santos, Paulo Guilherme, 449 Scedrov, Andre, 19 Schaefer, Frank, 12 Schnorr, Claus Peter, 12 Scholz, Heinrich, 2 Schroeder-Heister, Peter, 54, 136, 145, 147, 154, 156, 159, 212, 219, 368–371, 399 Schröder, Ernst, 13 Schwabhäuser, Wolfram, 2, 3 Schwichtenberg, Helmut, 3, 9, 30, 221 Schütte, Kurt, 3 Scott, Dana, 17 second-order arithmetic, 426 second-order propositional intuitionistic logic (System F), 316, 329 Seligman, Jerry, 219 semantical completeness, 10 separation principles, 426 sequent calculus, 212 set-abstraction operator, 86, 87, 96, 100, 102, 112, 115 set-theoretic reduction, 451 Shastri, Lokendra, 21 Σ 1 reduction, 429 Simon, Herbert, 15 simple inference, 148 Simpson's set theory ATR₀, 430 single-barreled abstraction, 95, 110 Slaney, John, 18 Sneed, Joseph D., 12 Solovay, Robert M., 8 Soloviev, Sergei, 19

solution, 175

soundness, 142 speech act, 136 square of opposition, 24 Stålmarck, Gunnar, 18 stand proud, 249, 252–258, 261–266, 271 Stegmüller, Wolfgang, 2, 12 Sternefeld, Wolfgang, 8 Stresius, Lothar, 4 structural proof theory, 450 structure of proofs, 449 structure theory of proofs, 375 subargument, 138 subatomic natural deduction, 450 subatomic system, 403 subformula property, 347 subformula sequent calculus, 347 substructural logics, 17, 19–20 Sundholm, Göran, 10, 19, 136, 140, 142, 144, 443 Suppes, Patrick, 218 support, 136 syllogism, 142 synonymy, 339, 340, 342, 354, 448 synthetic rules, 282 Tait, William, 29 Takeuti, Gaisi, 220 Tarski semantics, 1 Tarski, Alfred, 1, 60, 141, 217 Tennant, Neil, 18, 26, 201, 229, 369, 444, 446, 447, 449 term assumption, 403 Tesconi, Laura, 447 Textor, Mark, 234 Thiel, Christian, 5, 12, 13 Tichý, Pavel, 54, 136 tonk, 315 Tranchini, Luca, 23, 27, 29, 369, 370, 444, 448–450 Tristan chord, 15 Troelstra, Anne S., 3, 144, 145, 221, 343 truth-value constants, 391, 394, 395, 450


truthmaker, 86 truths of fact, 71 truths of reason, 71 Turing, Alan, 2 two-sorted typed λ-calculus, 358

Urelemente, 87, 88, 95, 96, 104–106, 108, 109, 111

validity immediate, 143 mediate, 143 proof-theoretic, 445 van Atten, Mark, 197 van Beethoven, Ludwig, 15 van Benthem, Johan, 19 van Dalen, Dirk, 3, 144, 145 variable-binding term-forming operator, 86 Veloso, Paulo, 166, 182 verum (⊤), 391, 394, 395 virtual set theory, 132 Vogel, Martin, 6, 15 von Helmholtz, Hermann, 6 von Kutschera, Franz, 1, 9, 13, 54, 136 von Mises, Richard, 12

von Oettingen, Arthur, 6 von Plato, Jan, 26, 31, 220, 450 von Stechow, Arnim, 8 Voronkov, Andrei, 219 Wald, Abraham, 12 Wang, Hao, 219 Wansing, Heinrich, 1, 30, 32, 214, 234, 448 Wehmeier, Kai, 13, 29, 55, 212 Weingartner, Paul, 22 Welchman, Gordon, 2 Wette, Eduard, 3 Weyl, Hermann, 364, 371 Więckowski, Bartosz, 29, 30, 218, 450 Widebäck, Filip, 24 Wittgenstein, Ludwig, 5, 212 Wolters, Gereon, 6–7, 30 World Logic Day, 25, 32 Wundt, Wilhelm, 13 Zermelo, Ernst, 364

Zermelo-Fraenkel set theory, 434 Zimmermann, Ernst, 21, 30, 31 Zimmermann, Thomas E., 8